Yu Lu Liu
liuyulu.bsky.social
PhD student at Johns Hopkins University
Alumna of McGill University & MILA
Working on NLP Evaluation, Responsible AI, Human-AI interaction
she/her 🇨🇦
Pinned
Human-centered Evaluation and Auditing of Language models (HEAL) workshop is back for #CHI2025, with this year's special theme: “Mind the Context”! Come join us on this bridge between #HCI and #NLProc!

Workshop submission deadline: Feb 17 AoE
More info at heal-workshop.github.io.
Reposted by Yu Lu Liu
This was accepted to #NeurIPS 🎉🎊

TL;DR Impoverished notions of rigor can have a formative impact on AI work. We argue for a broader conception of what rigorous work should entail & go beyond methodological issues to include epistemic, normative, conceptual, reporting & interpretative considerations
We have to talk about rigor in AI work and what it should entail. The reality is that impoverished notions of rigor do not only lead to some one-off undesirable outcomes but can have a deeply formative impact on the scientific integrity and quality of both AI research and practice 1/
September 29, 2025 at 11:13 PM
Reposted by Yu Lu Liu
We are excited to kick off the 2nd HEAL workshop tomorrow at #CHI2025. Dr. Su Lin Blodgett and Dr. Gagan Bansal from MSR will be our keynote speakers!

Welcome new and old friends! See you at G221!

All accepted papers: tinyurl.com/bdfpjcr4
April 25, 2025 at 11:28 AM
Reposted by Yu Lu Liu
Bringing together our incredible current and admitted students—future leaders, innovators, and changemakers!
March 7, 2025 at 5:15 AM
📣 DEADLINE EXTENSION 📣

By popular request, the HEAL workshop submission deadline is extended to Feb 24 AoE!

Reminder that we welcome a wide range of submissions: position papers, lit reviews, encore of published work, etc.

Looking forward to your submissions!
Human-centered Evaluation and Auditing of Language models (HEAL) workshop is back for #CHI2025, with this year's special theme: “Mind the Context”! Come join us on this bridge between #HCI and #NLProc!

Workshop submission deadline: Feb 17 AoE
More info at heal-workshop.github.io.
February 13, 2025 at 10:05 PM
Reposted by Yu Lu Liu
Thrilled that our paper Faux Polyglot has been accepted to #NAACL2025 main! 🚀
We show that multilingual RAG creates language-specific information cocoons and amplifies perspectives and facts in the dominant language, especially when handling knowledge conflicts.
📜 arxiv.org/abs/2407.05502
Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models
With Retrieval Augmented Generation (RAG), Large Language Models (LLMs) are playing a pivotal role in information search and are being adopted globally. Although the multilingual capability of LLMs of...
January 31, 2025 at 3:19 PM
The submission deadline is in less than a month! We welcome encore submissions, so consider submitting your work regardless of whether it's been accepted or not #chi2025 😉
Human-centered Evaluation and Auditing of Language models (HEAL) workshop is back for #CHI2025, with this year's special theme: “Mind the Context”! Come join us on this bridge between #HCI and #NLProc!

Workshop submission deadline: Feb 17 AoE
More info at heal-workshop.github.io.
January 22, 2025 at 3:32 PM
Human-centered Evaluation and Auditing of Language models (HEAL) workshop is back for #CHI2025, with this year's special theme: “Mind the Context”! Come join us on this bridge between #HCI and #NLProc!

Workshop submission deadline: Feb 17 AoE
More info at heal-workshop.github.io.
December 16, 2024 at 10:07 PM
Reposted by Yu Lu Liu
Super excited to announce that @msftresearch.bsky.social's FATE group, Sociotechnical Alignment Center, and friends have several workshop papers at next week's @neuripsconf.bsky.social. A short thread about (some of) these papers below... #NeurIPS2024
December 2, 2024 at 11:02 PM
Reposted by Yu Lu Liu
📣 📣 Interested in an internship on human-centred AI, human agency, AI evaluation & the impacts of AI systems? Our team/FATE MLT (Su Lin Blodgett, @qveraliao.bsky.social & I) is looking for a few summer interns 🎉 Apply by Jan 10 for full consideration: jobs.careers.microsoft.com/global/en/jo...
December 5, 2024 at 8:11 PM
Reposted by Yu Lu Liu
Seeing cool works on metrology and measurement modeling for NLP!

So I wanted to port over the thread on our ACL 2023 Findings paper (arxiv.org/abs/2305.09022) about conceptualizations of NLP tasks and measurements of performance! Work with Eric Yuan, @haldaume3.bsky.social, and Su Lin Blodgett. (1/n)
It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance
Progress in NLP is increasingly measured through benchmarks; hence, contextualizing progress requires understanding when and why practitioners may disagree about the validity of benchmarks. We develop...
December 4, 2024 at 6:37 PM
Reposted by Yu Lu Liu
I am collecting examples of the most thoughtful writing about generative AI published in 2024. What’s yours? They can be insightful for commentary, smart critique, or just because it shifted the conversation. I’ll post some of mine below as I go through them. #criticalAI
December 2, 2024 at 4:09 AM
Reposted by Yu Lu Liu
Created a small starter pack including folks whose work I believe contributes to more rigorous and grounded AI research -- I'll grow this slowly and likely move it to a list at some point :) go.bsky.app/P86UbQw
November 30, 2024 at 7:58 PM
Reposted by Yu Lu Liu
Hi, so I've spent the past almost-decade studying research uses of public social media data, like e.g. ML researchers using content from Twitter, Reddit, and Mastodon.

Anyway, buckle up this is about to be a VERY long thread with lots of thoughts and links to papers. 🧵
First dataset for the new @huggingface.bsky.social @bsky.app community organisation: one-million-bluesky-posts 🦋

📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect to experiment with using ML for Bluesky 🤗

huggingface.co/datasets/blu...
bluesky-community/one-million-bluesky-posts · Datasets at Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
November 27, 2024 at 3:31 PM
Reposted by Yu Lu Liu
Had a lot of fun teaching a tutorial on Human-Centered Evaluation of Language Technologies at #EMNLP2024, w/ @ziangxiao.bsky.social, Su Lin Blodgett, and Jackie Cheung

We just posted the slides on our tutorial website: human-centered-eval.github.io
Human-Centered Eval@EMNLP24
November 26, 2024 at 8:55 PM
Reposted by Yu Lu Liu
🚨 NeurIPS 2024 Spotlight
Did you know we lack standards for AI benchmarks, despite their role in tracking progress, comparing models, and shaping policy? 🤯 Enter BetterBench–our framework with 46 criteria to assess benchmark quality: betterbench.stanford.edu 1/x
November 25, 2024 at 7:02 PM
Reposted by Yu Lu Liu
It turns out we had even more papers at EMNLP!

Let's complete the list with three more🧵
Our lab members recently presented 3 papers at @emnlpmeeting.bsky.social in Miami ☀️ 📜

From interpretability to bias/fairness and cultural understanding -> 🧵
November 24, 2024 at 2:17 AM
Reposted by Yu Lu Liu
Our lab members recently presented 3 papers at @emnlpmeeting.bsky.social in Miami ☀️ 📜

From interpretability to bias/fairness and cultural understanding -> 🧵
November 23, 2024 at 8:35 PM
Reposted by Yu Lu Liu
my first post, now that I am here with my 500+ closest friends 🙂 -- here is a tiny owl 🦉 I met some weeks back in the big apple 🍎 (picture by @sbucur.bsky.social)
November 22, 2024 at 11:19 PM
Reposted by Yu Lu Liu
McGill NLP just landed on this blue planet

bsky.app/profile/mcgi...
November 22, 2024 at 5:17 PM
The starter pack just surpassed 1/3 of its capacity! Don't be shy about reaching out to me if you are a researcher in this area, or if you have suggestions. Thank you 🥰
I’m putting together a starter pack for researchers working on human-centered AI evaluation. Reply or DM me if you’d like to be added, or if you have suggestions! Thank you!

(It looks NLP-centric at the moment, but that’s due to the current limits of my own knowledge 🙈)

go.bsky.app/G3w9LpE
November 23, 2024 at 1:10 AM
Reposted by Yu Lu Liu
I didn’t expect to wind up in the news over this but in hindsight, I guess it makes sense lol.

This is the first time I’ve been in the Herald since high school 😂.
November 20, 2024 at 3:17 AM
Reposted by Yu Lu Liu
“We argue that societal impacts [of GenAI] should be conceptualised as application- and context-specific, incommensurable, and shaped by questions of social power.” By @glenberman.bsky.social et al. arxiv.org/abs/2410.22985
Troubling Taxonomies in GenAI Evaluation
To evaluate the societal impacts of GenAI requires a model of how social harms emerge from interactions between GenAI, people, and societal structures. Yet a model is rarely explicitly defined in soci...
November 22, 2024 at 4:04 AM
I’m putting together a starter pack for researchers working on human-centered AI evaluation. Reply or DM me if you’d like to be added, or if you have suggestions! Thank you!

(It looks NLP-centric at the moment, but that’s due to the current limits of my own knowledge 🙈)

go.bsky.app/G3w9LpE
November 21, 2024 at 3:56 PM
Reposted by Yu Lu Liu
Putting together a JHU Center for Language and Speech Processing starter pack!

Please reply or DM me if you're doing research at CLSP and would like to be added - I'm still trying to find out which of us are on here so far.

go.bsky.app/JtWKca2
CLSP
November 19, 2024 at 3:37 PM