Clara Na
clarana.bsky.social
PhD student @ CMU LTI. efficiency/data in NLP/ML
Pinned
Building/customizing your own LLM? You'll want to curate training data for it, but how do you know what makes the data good?
You can try out recipes👩‍🍳 iterate on ✨vibes✨ but we can't actually test all possible combos of tweaks,,, right?? 🙅‍♂️WRONG! arxiv.org/abs/2410.15661 (1/n) 🧵
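The pitch above is that you can estimate the quality of every combination of data subsets without training a model per combination. As a toy illustration of that idea (a simplified reading of the paper's modular approach, with made-up perplexity numbers and a plain mean as the combination estimate, not the paper's actual procedure):

```python
from itertools import combinations
from statistics import mean

# Hypothetical eval perplexities for models trained on each individual
# data partition (illustrative numbers, not from the paper).
subset_ppl = {"web": 12.4, "code": 9.8, "books": 11.1, "wiki": 10.5}

def estimate_combo_ppl(parts):
    """Estimate the eval perplexity of a model trained on the union of
    `parts` as the mean of the constituent models' perplexities --
    a deliberately simplified stand-in for the paper's method."""
    return mean(subset_ppl[p] for p in parts)

# Score every non-empty combination of partitions without training
# a new model for any of them.
estimates = {
    parts: estimate_combo_ppl(parts)
    for r in range(1, len(subset_ppl) + 1)
    for parts in combinations(sorted(subset_ppl), r)
}
best = min(estimates, key=estimates.get)
```

With four partitions this scores all 15 non-empty combinations from only four trained models; the point of the paper is making that kind of exhaustive sweep affordable.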
Reposted by Clara Na
We’re excited about Oolong as a challenging benchmark for information aggregation! Let us know which models we should benchmark next 👀

Paper: arxiv.org/abs/2511.02817
Dataset: huggingface.co/oolongbench
Code: github.com/abertsch72/o...
Leaderboard: oolongbench.github.io
Oolong: Evaluating Long Context Reasoning and Aggregation Capabilities
As model context lengths continue to grow, concerns about whether models effectively use the full context length have persisted. While several carefully designed long-context evaluations have recently...
November 7, 2025 at 5:07 PM
Reposted by Clara Na
Can LLMs accurately aggregate information over long, information-dense texts? Not yet…

We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
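To make "simple-to-verify information aggregation" concrete, here is a toy question in that spirit (a hypothetical format for illustration, not an actual Oolong item): the answer requires reading the whole long input, but checking it is a trivial count.

```python
import random

# Build a long input of labeled lines; the gold answer is verifiable
# by a simple count over the same input.
random.seed(0)
labels = [random.choice(["positive", "negative", "neutral"]) for _ in range(5000)]
long_input = "\n".join(f"review {i}: {lab}" for i, lab in enumerate(labels))

question = "How many reviews are labeled 'negative'?"
gold = labels.count("negative")              # trivially verifiable
model_answer = long_input.count("negative")  # rule-based stand-in for a model

assert model_answer == gold
```

A rule-based counter aggregates perfectly here; the benchmark's finding is that LLMs reading the same long input do not.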
November 7, 2025 at 5:07 PM
Yes! tbh this method is probably much more immediately useful for helping one understand subtle differences between [models trained on] subtly different data subsets, vs a loftier goal of helping one find "the" best data mixture -- to anyone considering this method, please feel free to reach out :)
The method in this paper was designed to find an optimal data mixture. But researchers in the human sciences who are training models *in order to understand the effect of the data* might also consider this as a clever way of evaluating hundreds of subsets without training hundreds of models. #MLSky
May 6, 2025 at 4:16 AM
Come through! Poster #492 in Hall 2, 10am-12:30pm!
April 26, 2025 at 1:59 AM
Reposted by Clara Na
Our paper documenting the environmental impacts of creating OLMo language models is the most honest and comprehensive characterization I know of, including training, development (!) and inference costs. If you're at ICLR chat with @jacobcares.bsky.social & @clarana.bsky.social Sat morning 10-12:30!
April 25, 2025 at 1:14 PM
Reposted by Clara Na
I'm in Singapore for @iclr-conf.bsky.social ! Come check out our spotlight paper on the environmental impact of training OLMo (link in next tweet) during the Saturday morning poster session from 10-12:30 -- happy to chat about this or anything else! DMs should be open, email works too
April 23, 2025 at 3:22 PM
Reposted by Clara Na
We've received multiple notes that NOAA research services (Office of Oceanic and Atmospheric Research) may go offline at midnight. @safeguardingdata.bsky.social is working on web archiving, but if others want to nominate on this, that might be good: digital2.library.unt.edu/nomination/G...
Nomination Tool: Project URL Nomination
April 3, 2025 at 9:36 PM
Reposted by Clara Na
How can we better think and talk about human-like qualities attributed to language technologies like LLMs? In our #CHI2025 paper, we taxonomize how text outputs from cases of user interactions with language technologies can contribute to anthropomorphism. arxiv.org/abs/2502.09870 1/n
March 6, 2025 at 3:43 AM
Reposted by Clara Na
Did you know? Gestures used to express universal concepts, like wishing for luck, vary DRAMATICALLY across cultures!
🤞 means luck in the US but is deeply offensive in Vietnam 🚨

📣 We introduce MC-SIGNS, a test bed to evaluate how LLMs/VLMs/T2I handle such nonverbal behavior!

📜: arxiv.org/abs/2502.17710
February 26, 2025 at 4:23 PM
Reposted by Clara Na
the science of LMs should be fully open✨

today @akshitab.bsky.social @natolambert.bsky.social and I are giving our #neurips2024 tutorial on language model development.

everything from data to training to adaptation. published or not, no secrets 🫡

tues, 12/10, 9:30am PT ☕️

neurips.cc/virtual/2024...
NeurIPS Tutorial Opening the Language Model Pipeline: A Tutorial on Data Preparation, Model Training, and AdaptationNeurIPS 2024
December 10, 2024 at 3:31 PM
Reposted by Clara Na
How open is “open” AI, really?
It isn’t just about making models reusable. If the origin of data is opaque, if labor is hidden & exploited, if frameworks are dominated by Big Tech, if computational power is concentrated in an oligopoly… ‘open’ is just a label.

Meredith Whittaker & friends in Nature.
December 3, 2024 at 5:49 PM
Reposted by Clara Na
I noticed a lot of starter packs skewed towards faculty/industry, so I made one of just NLP & ML students: go.bsky.app/vju2ux

Students do different research, go on the job market, and recruit other students. Ping me and I'll add you!
November 23, 2024 at 7:54 PM
Reposted by Clara Na
💬 Have you or a loved one compared LM probabilities to human linguistic acceptability judgments? You may be overcompensating for the effect of frequency and length!
🌟 In our new paper, we rethink how we should be controlling for these factors 🧵:
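For background on what "controlling for frequency and length" can look like, one standard control from prior acceptability work is SLOR (syntactic log-odds ratio), which subtracts a unigram baseline and normalizes by length. This is shown purely as context with toy numbers; it is not necessarily the control this paper proposes (the paper argues for rethinking such controls):

```python
def slor(logp_sentence, unigram_logps):
    """SLOR: (log P_model(s) - sum of unigram log-probs) / length.
    Subtracting the unigram term discounts word frequency; dividing by
    length discounts sentence length. Toy background example, not the
    paper's method."""
    n = len(unigram_logps)
    return (logp_sentence - sum(unigram_logps)) / n

# Same raw LM log-prob, but one sentence is built from rarer words.
common = slor(-12.0, [-2.0, -2.5, -2.0, -2.5])   # frequent words
rare   = slor(-12.0, [-6.0, -7.0, -6.5, -6.0])   # rare words
```

After the frequency control, the rare-word sentence scores higher than the common-word one despite identical raw log-probability; whether this over- or under-corrects is exactly the kind of question the thread raises.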
November 20, 2024 at 6:08 PM
Hi, I'm at #232 in the back of the Riverfront room!
November 14, 2024 at 3:28 PM
I'm at EMNLP! Presenting the poster for this paper on Thursday morning (10:30-12), Session F Riverfront Hall, come say hi :)
November 13, 2024 at 3:08 PM
Reposted by Clara Na
(Hehe first bsky post!) I'll be at #EMNLP2024 💃🌴! Happy to chat about (among other things):
✨linguistically+cognitively motivated evaluation
✨NLP for low-resource+endangered languages
✨figuring out what features of language data LMs are *actually* learning
I'll be presenting two posters 🧵:
November 8, 2024 at 6:39 PM
scrolling,,, minimal doom ?!
November 9, 2024 at 12:58 AM
Reposted by Clara Na
Understanding “Democratization” in NLP and ML Research - joint work that @arjunsubgraph.bsky.social and I co-led with Dietrich Klakow and @zeerak.bsky.social
aclanthology.org/2024.emnlp-m...
Understanding “Democratization” in NLP and ML Research
Arjun Subramonian, Vagrant Gautam, Dietrich Klakow, Zeerak Talat. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024.
November 8, 2024 at 11:23 PM
Reposted by Clara Na
A starter pack for #NLP #NLProc researchers! 🎉

go.bsky.app/SngwGeS
November 4, 2024 at 10:01 AM
Building/customizing your own LLM? You'll want to curate training data for it, but how do you know what makes the data good?
You can try out recipes👩‍🍳 iterate on ✨vibes✨ but we can't actually test all possible combos of tweaks,,, right?? 🙅‍♂️WRONG! arxiv.org/abs/2410.15661 (1/n) 🧵
November 5, 2024 at 10:37 PM
Reposted by Clara Na
I think it’s fucked up that EMNLP 2023 emailed Findings authors on Nov 8 that they *might* have a chance to present at main conf, but also don’t forget to early register by Nov 12. Then only let authors know of *virtual* poster assignment 10 min before early registration closed.
November 13, 2023 at 8:18 AM
Reposted by Clara Na
Not at all surprised to see that junior people support the proposed anonymity changes to the ACL policies.

Speaking for myself and my "early career" goals, the anonymity deadlines are incredibly stressful and (as far as I can tell) not beneficial to me.
ACL anonymity working group
UKP-Cloud - The place for your files @ UKP Lab!
November 13, 2023 at 3:30 PM
Reposted by Clara Na
By learning our history, rather than exceptionalizing the current moment, it's easy to discover worthwhile directions for researchers interested in contributing to language model capabilities without access to industry-scale training. Enjoy your research!
November 10, 2023 at 3:15 PM