Lucy Li
@lucy3.bsky.social
Postdoc at UW NLP 🏔️. #NLProc, computational social science, cultural analytics, responsible AI. she/her. Previously at Berkeley, Ai2, MSR, Stanford. Incoming assistant prof at Wisconsin CS. lucy3.github.io
Reposted by Lucy Li
I had some fun pulling OpenAI's mission statement out of their IRS tax filings from 2016 to 2024, loading them into a git repo with fake commit dates and then taking a look at the diffs simonwillison.net/2026/Feb/13/...
The evolution of OpenAI’s mission statement
As a USA 501(c)(3) the OpenAI non-profit has to file a tax return each year with the IRS. One of the required fields on that tax return is to “Briefly …
February 13, 2026 at 11:40 PM
Reposted by Lucy Li
I doubt it. I would read the author's piece very literally. He just put this preprint on arxiv: arxiv.org/pdf/2601.19062 I think some (and my read, this includes the author) are realizing that much more than AI is disempowering us. Many of us have known this for a very long time, of course.
February 12, 2026 at 5:32 AM
Reposted by Lucy Li
I wrote a short article on AI Model Evaluation for the Open Encyclopedia of Cognitive Science 📕👇

Hope this is helpful for anyone who wants a super broad, beginner-friendly intro to the topic!

Thanks @mcxfrank.bsky.social and @asifamajid.bsky.social for this amazing initiative!
February 12, 2026 at 10:22 PM
Reposted by Lucy Li
Well done @zdenekkasner.bsky.social et al!

LLMs as Span Annotators: A Comparative Study of LLMs and Humans is accepted to multilingual-multicultural-evaluation.github.io 🎉

See paper arxiv.org/abs/2504.08697
January 29, 2026 at 3:35 PM
Reposted by Lucy Li
If you think labeling text spans with LLMs is easy, you probably have not tried it yourself (we have! 🙃).

Any method you can think of – be it tagging, matching, or indexing – has flaws.

In our new preprint, we tested them all 💪 We also proposed how to improve one of them.

arxiv.org/abs/2601.16946
January 29, 2026 at 2:20 PM
Reposted by Lucy Li
I am looking for 2 emergency reviewers for the ARR Ethics, Bias & Fairness track. Please DM me if you are available 🙏
February 10, 2026 at 9:27 AM
Reposted by Lucy Li
Recent publications arguing against the use of genAI in reflexive qual research inspired us (Elida Ibrahim and @andreavoyer.bsky.social) to write our own perspective. Not to convince anyone to use genAI but for those who might be interested and are looking for guidance.

osf.io/preprints/so...
February 9, 2026 at 6:49 PM
Reposted by Lucy Li
Bad Bunny's historical advisor is an assistant professor at UW-Madison.

Hell of a flex for your tenure file.
Not at all surprised to learn that Bad Bunny has a historical adviser. His halftime show was a reminder that our history and culture are deeply intertwined with the rest of the Western Hemisphere. We should think of his performance as part of #America250. #SuperBowl
news.wisc.edu/pop-star-bad...
Pop star Bad Bunny needed a Puerto Rican history scholar. UW–Madison had just the one.
Bad Bunny collaborated with UW–Madison history professor Jorell Meléndez-Badillo on Puerto Rican narratives that accompany the new album “DeBÍ TiRAR MáS FOToS.”
February 9, 2026 at 1:47 PM
Reposted by Lucy Li
Excited to be co-organizing the #CHI2026 workshop on augmented reading interfaces 📚✨ Submissions are open for one more week! We want to know what you're working on!
The #CHI2026 workshop on augmented reading interfaces is accepting submissions for one more week. We hope you consider formulating your perspective and sending it in! Can't wait to see y'all in Barcelona to talk about enriching experiences with written information.

chi-star-workshop.github.io
February 6, 2026 at 8:21 PM
Reposted by Lucy Li
Really sad to hear that First Monday is shutting down after 30 years. It was one of the first journals devoted to internet research & fully open access: no fees, no paywalls, and authors retained copyright.

My very first publication was there in 2004. End of an era.

firstmonday.org/ojs/index.ph...
First Monday @ 30 | First Monday
February 6, 2026 at 9:28 PM
Reposted by Lucy Li
PhD admissions visits/open houses are starting to happen, and I got a comment on an old Reddit post where I was offering advice, and realized that it's actually really good advice. So here it is! (And this applies whether you've already been admitted to the program or not.) 🧵
February 5, 2026 at 5:26 PM
Reposted by Lucy Li
I've always been a fan of what the Allen Institute is doing. New in Nature: OpenScholar, an 8B RAG model for scientific literature, outperforms GPT-4o by 6% on correctness. Experts preferred its answers over human-written ones 51–70% of the time. www.nature.com/articles/s41... 1/3 🧵
February 5, 2026 at 1:24 PM
Reposted by Lucy Li
Nearly 2 years ago, @jessyjli.bsky.social, @janetlauyeung.bsky.social, @valentinapy.bsky.social, and I decided that it's time to bring discourse structure to the center of NLP teaching.
February 5, 2026 at 3:53 AM
When I go to the dentist / doctor, they ask me if I'm in school or going to school and my answer has always been yes
I am a middle-aged college professor. My mom does this every semester. And it is one of my very favorite things.
February 2, 2026 at 9:54 PM
Reposted by Lucy Li
🎭 How do LLMs (mis)represent culture?
🧮 How often?
🧠 Misrepresentations = missing knowledge? spoiler: NO!

At #CHI2026 we are bringing ✨TALES✨ a participatory evaluation of cultural (mis)reps & knowledge in multilingual LLM-stories for India

📜 arxiv.org/abs/2511.21322

1/10
February 2, 2026 at 9:38 PM
Reposted by Lucy Li
🚀 Apply to CMU LTI’s Summer 2026 “Language Technology for All” internship! 🎓 Open to pre‑doctoral students new to language tech (non‑CS backgrounds welcome). 🔬 12–14 weeks in‑person in Pittsburgh — travel + stipend paid. 💸 Deadline: Feb 20, 11:59pm ET. Apply → forms.gle/cUu8g6wb27Hs...
CMU LTI Summer 2026 Internship Program Application
We are looking for applicants for the Carnegie Mellon University Language Technology Institute's Summer 2026 "Language Technology for All" internship program. The main goal of this internship is to pr...
February 2, 2026 at 3:41 PM
Reposted by Lucy Li
As these teens describe, AI can diminish human relationships; devalue art; threaten the environment; lead to laziness; give unreliable results; pose privacy concerns; and be misused.

So, please, stop with the narratives of inevitability and let's embrace a pedagogy and politics of refusal.
7 Reasons Teens Say No to AI
Some young people only turn to artificial-intelligence chatbots as a last resort, citing concerns about relationships, creativity, the environment and more.
February 1, 2026 at 10:24 PM
Reposted by Lucy Li
If you're reading about Epstein and his friends today, could you also take time to read this document?

I guarantee that someone at your university right now is a sexual harasser and/or abuser. Maybe you could stop them.
Sexual harassment is a horrible impediment to academic research, shutting out talented researchers and slowing scientific progress.

What can we do? I believe we're not helpless; we can improve our communities through practical actions.

Take a look: github.com/maria-antoni...
GitHub - maria-antoniak/fight-harassment-in-research
February 1, 2026 at 5:42 PM
Reposted by Lucy Li
Somehow before becoming a prof I came away with the impression that grant writing was just an annoying task profs have to do, and yes, more rejection sucks, but it is wonderful to start new collaborations with super smart people, brainstorm hard, and think on a larger scale than the next few papers
January 30, 2026 at 8:59 PM
Reposted by Lucy Li
🚨 New Study 🚨

@arxiv.bsky.social has recently decided to prohibit any 'position' paper from being submitted to its CS servers.
Why? Because of the "AI slop", and allegedly higher ratios of LLM-generated content in review papers, compared to non-review papers.
January 29, 2026 at 2:00 PM
Reposted by Lucy Li
“Position paper” is just a label to make some kinds of interdisciplinary theoretical work fit into the CS publishing schema.
In addition, when considering different subfields, we find striking differences in how we classify 'position' papers, leading to huge differences in how this policy affects different CS subfields.
January 29, 2026 at 4:42 PM
Reposted by Lucy Li
📢 CSCW is piloting a rolling submissions experiment.

🔸 More flexibility for authors
🔸 Emphasis on paper quality over deadline pressure

🧪 Important: Only a small number of papers will be invited to participate, and authors can nominate their work for possible selection.
cscw.acm.org/2026/rolling...
CSCW 2026
January 29, 2026 at 12:12 AM
Reposted by Lucy Li
"We analyze all papers published at ACL, NAACL, and EMNLP in 2024 and 2025... nearly 300 papers contain at least one HalluCitation... Notably, half of these papers were identified at EMNLP 2025 ... indicating that this issue is rapidly increasing."

https://www.arxiv.org/abs/2601.18724
HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences
Recently, we have often observed hallucinated citations or references that do not correspond to any existing work in papers under review, preprints, or published papers. Such hallucinated citations pose a serious concern to scientific reliability. When they appear in accepted papers, they may also negatively affect the credibility of conferences. In this study, we refer to hallucinated citations as "HalluCitation" and systematically investigate their prevalence and impact. We analyze all papers published at ACL, NAACL, and EMNLP in 2024 and 2025, including main conference, Findings, and workshop papers. Our analysis reveals that nearly 300 papers contain at least one HalluCitation, most of which were published in 2025. Notably, half of these papers were identified at EMNLP 2025, the most recent conference, indicating that this issue is rapidly increasing. Moreover, more than 100 such papers were accepted as main conference and Findings papers at EMNLP 2025, affecting the credib
January 28, 2026 at 5:40 PM
Reposted by Lucy Li
Hello!

This is a reminder that @cornelltech.bsky.social runs a Red Team Clinic that provides a *free* safety consultation to nonprofits / public sector orgs that are developing a public-facing AI tool and want to stress-test it for possible abuse vectors.

Applications welcome on a rolling basis:
‘Red team’ students stress-test NYC health department’s AI | Cornell Chronicle
People usually strive to be their true, authentic selves, but this fall, five master’s students at Cornell Tech adopted not only alter egos but also “bad intent,” in an effort to make AI safer for hea...
January 28, 2026 at 9:33 PM
Reposted by Lucy Li
Demographic cues (e.g., names, dialect) are widely used to study how LLM behavior may change depending on user demographics. Such cues are often assumed interchangeable.

🚨 We show they are not: different cues yield different model behavior for the same group and different conclusions on LLM bias. 🧵👇
January 27, 2026 at 1:07 PM