Ana Marasović
@anamarasovic.bsky.social
Asst prof @ University of Utah · NLP · she/her 🇭🇷
Luca Guadagnino making an OpenAI movie, what's happening 👀👀
November 16, 2025 at 9:07 PM
I didn't submit to ICLR, but I'm pretty sure next week I'll see similar quality issues in ARR

I feel like execs of major ML/AI conferences from ICLR/NeurIPS/ICML/AAAI, ACL/EMNLP, to CVPR should sit together and figure out a whole new strategy moving forward like 👇
Sort of starting to believe that we really do need academic metrics that punish publishing too much
November 14, 2025 at 5:28 PM
When your husband is also an academic, so when you can't reach him by phone you shoot him an email, and it works every time 😂💀
November 13, 2025 at 1:10 AM
Reposted by Ana Marasović
We're surveying researchers about name changes in academic publishing.

If you've changed your name and dealt with updating publications, we want to hear your experience. Any reason counts: transition, marriage, cultural reasons, etc.

forms.cloud.microsoft/e/E0XXBmZdEP
October 21, 2025 at 12:45 PM
@mclemcrew.bsky.social's CoLM spotlight is now available on YT! 🎵

youtu.be/w6LNmADnlNw?...
MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing
YouTube video by Conference on Language Modeling
November 11, 2025 at 10:41 PM
Reposted by Ana Marasović
*Urgently* looking for emergency reviewers for the ARR October Interpretability track 🙏🙏

ReSkies much appreciated
November 11, 2025 at 10:29 AM
Reposted by Ana Marasović
Outstanding paper (5/7):

"Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps"
by Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, and Yonatan Belinkov
aclanthology.org/2025.emnlp-m...

6/n
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, Yonatan Belinkov. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.
November 7, 2025 at 10:32 PM
𝙒𝙚'𝙧𝙚 𝙝𝙞𝙧𝙞𝙣𝙜 𝙣𝙚𝙬 𝙛𝙖𝙘𝙪𝙡𝙩𝙮 𝙢𝙚𝙢𝙗𝙚𝙧𝙨!

KSoC: utah.peopleadmin.com/postings/190... (AI broadly)

Education + AI:
- utah.peopleadmin.com/postings/189...
- utah.peopleadmin.com/postings/190...

Computer Vision:
- utah.peopleadmin.com/postings/183...
November 7, 2025 at 11:35 PM
Reposted by Ana Marasović
🎉 Congratulations to all #EMNLP2025 award winners 🎉

Starting with the ✨Best Paper award ✨:

"Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index"
by Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, and Hannaneh Hajishirzi
aclanthology.org/2025.emnlp-m...

1/n
November 7, 2025 at 10:29 PM
Thrilled to see this work recognized at #EMNLP2025!

This framework and approach to measuring CoT faithfulness have been hugely influential for how I think about reasoning evaluation, and I'm so lucky to have worked with such brilliant collaborators. Huge credit to @mtutek.bsky.social
Very honored to be one out of seven outstanding papers at this year's EMNLP :)

Huge thanks to my amazing collaborators @fatemehc.bsky.social @anamarasovic.bsky.social @boknilev.bsky.social , this would not have been possible without them!
November 7, 2025 at 4:56 PM
Reposted by Ana Marasović
Very honored to be one out of seven outstanding papers at this year's EMNLP :)

Huge thanks to my amazing collaborators @fatemehc.bsky.social @anamarasovic.bsky.social @boknilev.bsky.social , this would not have been possible without them!
November 7, 2025 at 8:58 AM
Check out Martin's talk at #EMNLP2025 today (Wed)!

If you care about CoT faithfulness, you 𝘮𝘶𝘴𝘵 read this paper. It introduces the first method for measuring CoT faithfulness that is not purely behavioral, but operates with the internals!
Flying out to @emnlpmeeting soon🇨🇳
I'll present our parametric CoT faithfulness work (arxiv.org/abs/2502.14829) on Wednesday at the second Interpretability session, 16:30-18:00 local time A104-105

If you're in Suzhou, reach out to talk all things reasoning :)
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. Despite much work o...
November 4, 2025 at 10:54 PM
Go check Alex's poster today (Wed) in Suzhou! #EMNLP2025

I'm still so proud of our work (led by @lasha.bsky.social) on CondaQA, so we had to ask what would happen if we tried to create high-quality reasoning-over-text benchmarks now that LLMs are available. Turns out, we'd make an easier benchmark!
I'll be in Suzhou 🇨🇳 at #EMNLP this week presenting "What has been Lost with Synthetic Evaluation?" done with @anamarasovic.bsky.social & @lasha.bsky.social! 🎉

📍Findings Session 1 - Hall C
📅 Wed, November 5, 13:00 - 14:00

arxiv.org/abs/2505.22830
November 4, 2025 at 10:44 PM
Reposted by Ana Marasović
Flying out to @emnlpmeeting soon🇨🇳
I'll present our parametric CoT faithfulness work (arxiv.org/abs/2502.14829) on Wednesday at the second Interpretability session, 16:30-18:00 local time A104-105

If you're in Suzhou, reach out to talk all things reasoning :)
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. Despite much work o...
October 31, 2025 at 1:30 PM
Reposted by Ana Marasović
Sort of starting to believe that we really do need academic metrics that punish publishing too much
November 3, 2025 at 10:38 PM
Reposted by Ana Marasović
I'll be in Suzhou 🇨🇳 at #EMNLP this week presenting "What has been Lost with Synthetic Evaluation?" done with @anamarasovic.bsky.social & @lasha.bsky.social! 🎉

📍Findings Session 1 - Hall C
📅 Wed, November 5, 13:00 - 14:00

arxiv.org/abs/2505.22830
November 3, 2025 at 11:03 AM
Reposted by Ana Marasović
🧠 Can large language models build the very benchmarks used to evaluate them?
In “What Has Been Lost with Synthetic Evaluation”, Ana Marasović (@anamarasovic.bsky.social) and collaborators ask what happens when LLMs start generating the datasets used to test their reasoning. (1/6🧵)
October 20, 2025 at 4:01 PM
Reposted by Ana Marasović
👉 Do large language models really reason the way their chain-of-thoughts suggest?
This week on #WiAIRpodcast, we talk with Ana Marasović (@anamarasovic.bsky.social) about her paper “Chain-of-Thought Unfaithfulness as Disguised Accuracy.” (1/6🧵)
📄 Paper: arxiv.org/pdf/2402.14897
October 15, 2025 at 4:06 PM
Reposted by Ana Marasović
Can you trust your reward model alignment scores?
New work, presented today at the COLM Workshop on Socially Responsible Language Modelling Research and led by Purbid Bambroo in collaboration with @anamarasovic.bsky.social, probes LLM preference test sets for redundancy and inflated scores.

1/8
October 10, 2025 at 4:03 PM
Reposted by Ana Marasović
📣Tomorrow at #COLM2025:

1️⃣ Purbid's 𝐩𝐨𝐬𝐭𝐞𝐫 at 𝐒𝐨𝐋𝐚𝐑 (𝟏𝟏:𝟏𝟓𝐚𝐦-𝟏:𝟎𝟎𝐩𝐦) on catching redundant preference pairs & how pruning them hurts accuracy; www.anamarasovic.com/publications...

2️⃣ My 𝐭𝐚𝐥𝐤 at 𝐗𝐋𝐋𝐌-𝐑𝐞𝐚𝐬𝐨𝐧-𝐏𝐥𝐚𝐧 (𝟏𝟐𝐩𝐦) on measuring CoT faithfulness by looking at internals, not just behaviorally

1/3
October 9, 2025 at 4:54 PM
Sad: Can't go to CoLM because of immigration.

Happy: Well, at least I can mountain bike during the fall break in prime SLC MTB weather.

Sad: Comes down with a cold.

☹️☹️☹️☹️☹️☹️
October 8, 2025 at 11:22 PM
I had a great time chatting with Jekaterina and Malikeh. This episode is like a tour of all the things I've been studying lately!
🎙️ New Women in AI Research episode out now!
This time, we sit down with @anamarasovic.bsky.social to unpack some of the toughest questions in AI explainability and trust.

🔗 Watch here → youtu.be/xYb6uokKKOo
October 8, 2025 at 5:13 PM
Reposted by Ana Marasović
🎙️ New Women in AI Research episode out now!
This time, we sit down with @anamarasovic.bsky.social to unpack some of the toughest questions in AI explainability and trust.

🔗 Watch here → youtu.be/xYb6uokKKOo
October 8, 2025 at 4:03 PM
Happening today! #COLM2025
Honored 🎷🎸🥁 𝗠𝗶𝘅𝗔𝘀𝘀𝗶𝘀𝘁 🎷🎸🥁 was selected as the #COLM2025 oral spotlight. Go check out @mclemcrew.bsky.social's 𝐭𝐚𝐥𝐤 on 𝐖𝐞𝐝 (𝐎𝐜𝐭 𝟖) at 𝟑:𝟑𝟎𝐩𝐦 in 517BC and 𝐩𝐨𝐬𝐭𝐞𝐫 from 𝟒:𝟑𝟎-𝟓:𝟑𝟎 in 710!
Mega stoked to attend #COLM25 this week and present our work, MixAssist, on Wednesday!

@anamarasovic.bsky.social sadly can't make it 😭, but hit me up if you'd like to chat about audio language models, music mixing, or anything else regarding music and audio!
October 8, 2025 at 3:32 PM