Katie Keith
@katakeith.bsky.social
NLP and computational social science (CSS) researcher. Assistant Professor in Computer Science at Williams College. AI2 and UMass Amherst alum. she/her. https://kakeith.github.io/
Whoa...!! If it's at all social-science leaning, maybe try other preprint servers? SocArXiv, for example? We put one of our preprints there: osf.io/preprints/so...
August 27, 2025 at 7:02 PM
Yes! I agree. It's so rare these days to see a keynote that is so thorough and full of new conceptualizations.
August 12, 2025 at 2:12 AM
Under review! Happy to share a draft if you email me. Thanks!
July 23, 2025 at 7:14 PM
Thanks:)
July 23, 2025 at 2:39 PM
Not as recent, but still LLM-based

"WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation." GPT-3 composes new examples with similar patterns to challenging examples.

aclanthology.org/2022.finding...
July 23, 2025 at 1:05 PM
I thought this was a clever and useful paper from Xiong, ... Hovy, El-Assady, and Ash: "Co-DETECT: Collaborative Discovery of Edge Cases in Text Classification." It uses LLMs to help humans refine their codebooks (before the codebooks are fixed for the true annotation stage). arxiv.org/pdf/2507.05010
July 23, 2025 at 1:00 PM
We used active learning to create a human-annotated dataset of 1050 instances from FOMC transcripts, labeled for FOMC members' opinions and directional stance towards monetary policy. The preprint and dataset should be released publicly by the end of the summer, but email me for an advance copy.
July 23, 2025 at 12:53 PM
Yay! I'm there as well. Let's sync up.
July 20, 2025 at 11:31 AM
Personally, I find I have to burn a day answering all the questions (particularly for a dataset release). I think it should be condensed to the 5 most important ones.
May 20, 2025 at 6:27 PM
Our semi-synthetic experiments use MIMIC-III clinical notes and two open-weight LLMs and show that our method produces estimates with low bias.
December 11, 2024 at 1:10 AM
For settings with an unobserved (but known) confounding variable, we propose a new causal inference method that uses two instances of pre-treatment text data, infers two proxies using two zero-shot models on the separate instances, and applies these proxies in the proximal g-formula.
December 11, 2024 at 1:10 AM
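To make the two-proxy idea concrete, here is a minimal hypothetical sketch of the proximal g-formula with binary proxies W and Z standing in for the two zero-shot-inferred proxies. The simulation setup (variable names, distributions, effect sizes) is entirely assumed for illustration; the actual method in the preprint operates on text data and LLM outputs and may differ in its estimator.

```python
import numpy as np

def proximal_gformula_ate(A, Y, W, Z):
    """ATE via the proximal g-formula with binary proxies W and Z.

    For each treatment level a, solve the outcome-bridge system
        E[Y | Z=z, A=a] = sum_w h(w, a) * P(W=w | Z=z, A=a),  z in {0,1},
    then return E[h(W,1)] - E[h(W,0)], marginalizing over P(W).
    """
    p_w = np.array([np.mean(W == 0), np.mean(W == 1)])  # marginal P(W)
    means = []
    for a in (0, 1):
        M = np.empty((2, 2))  # M[z, w] = P(W=w | Z=z, A=a)
        b = np.empty(2)       # b[z]    = E[Y | Z=z, A=a]
        for z in (0, 1):
            mask = (A == a) & (Z == z)
            b[z] = Y[mask].mean()
            M[z, 0] = np.mean(W[mask] == 0)
            M[z, 1] = np.mean(W[mask] == 1)
        h = np.linalg.solve(M, b)  # bridge values h(0,a), h(1,a)
        means.append(p_w @ h)      # E[Y(a)]
    return means[1] - means[0]

# Toy simulation: unobserved confounder U, two noisy proxies W and Z,
# true treatment effect of 2.0. The naive difference in means is
# confounded by U; the proximal estimate should be close to 2.0.
rng = np.random.default_rng(0)
n = 200_000
U = rng.binomial(1, 0.5, n)
Z = U ^ rng.binomial(1, 0.1, n)          # proxy 1 (90% agreement with U)
W = U ^ rng.binomial(1, 0.1, n)          # proxy 2, independent noise
A = rng.binomial(1, 0.2 + 0.6 * U)       # treatment depends on U
Y = 2.0 * A + 3.0 * U + rng.normal(0, 1, n)

ate = proximal_gformula_ate(A, Y, W, Z)
naive = Y[A == 1].mean() - Y[A == 0].mean()
```

The key conditions in this toy setup are that W is independent of (A, Z) given U and that Y does not depend on Z directly, which is what lets the bridge function h absorb the confounding that U induces.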