Martin Tutek
mtutek.bsky.social
Postdoc @ TakeLab, UniZG | previously: Technion; TU Darmstadt | PhD @ TakeLab, UniZG

Faithful explainability, controllability & safety of LLMs.

🔎 On the academic job market 🔎

https://mttk.github.io/
Pinned
🚨🚨 New preprint 🚨🚨

Ever wonder whether verbalized CoTs correspond to the internal reasoning process of the model?

We propose a novel parametric faithfulness approach, which erases information contained in CoT steps from the model parameters to assess CoT faithfulness.

arxiv.org/abs/2502.14829
Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps
When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. However, despite mu...
arxiv.org
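The intuition behind the approach, as a minimal sketch (the scoring rule below is hypothetical and for illustration only; the paper's actual metrics may differ): if erasing a CoT step's information from the parameters drops the probability of the model's original answer, that step was load-bearing for the prediction.

```python
def faithfulness_score(p_before: float, p_after: float) -> float:
    """Score how load-bearing an unlearned CoT step was, as the
    relative drop in the model's original-answer probability after
    that step's information is erased from the parameters.
    Hypothetical scoring rule for illustration only."""
    if p_before <= 0.0:
        raise ValueError("p_before must be positive")
    # Clamp at 0: a step whose erasure leaves the answer
    # unchanged (or more likely) was not load-bearing.
    return max(0.0, (p_before - p_after) / p_before)

# A step whose erasure halves the answer probability scores 0.5;
# a step whose erasure changes nothing scores 0.0.
print(faithfulness_score(0.5, 0.25))  # 0.5
print(faithfulness_score(0.5, 0.5))   # 0.0
```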
Reposted by Martin Tutek
Nathan Stringham, Fateme Hashemi Chaleshtori, Xinyuan Yan, Zhichao Xu, Bei Wang, Ana Marasović
Teaching People LLM's Errors and Getting it Right
https://arxiv.org/abs/2512.21422
December 29, 2025 at 7:45 AM
Reposted by Martin Tutek
🚀 We’re opening 2 fully funded postdoc positions in #NLP!

Join the MilaNLP team and contribute to our upcoming research projects.

🔗 More details: milanlproc.github.io/open_positio...

⏰ Deadline: Jan 31, 2026
December 18, 2025 at 3:29 PM
Reposted by Martin Tutek
COLM 2026 is just around the corner! Mark your calendars for:

💡 Abstract deadline: Thursday, March 26, 2026
📄 Full paper submission deadline: Tuesday, March 31, 2026

Call for papers (website coming soon):
docs.google.com/document/d/1...
December 16, 2025 at 3:31 PM
Reposted by Martin Tutek
At the #Neurips2025 mechanistic interpretability workshop I gave a brief talk about Venetian glassmaking, since I think we face a similar moment in AI research today.

Here is a blog post summarizing the talk:

davidbau.com/archives/202...
December 11, 2025 at 3:03 PM
Reposted by Martin Tutek
I’m recruiting a postdoc to work on algorithms for cancer genome reconstruction. We have access to a rich set of tumour samples sequenced across multiple technologies. If interested, feel free to DM. Please share.
December 11, 2025 at 3:04 AM
Reposted by Martin Tutek
🧑‍🔬I’m recruiting PhD students in Natural Language Processing at @unileipzig.bsky.social Computer Science, together with @scadsai.bsky.social!

Topics include, but aren’t limited to:

🔎Linguistic Interpretability
🌍Multilingual Evaluation
📖Computational Typology

Please share!

#NLProc #NLP
December 11, 2025 at 1:36 PM
Reposted by Martin Tutek
I will be @euripsconf.bsky.social this week to present our paper as non-archival at the PAIG workshop (Beyond Regulation: Private Governance & Oversight Mechanisms for AI). Very much looking forward to the discussions!

If you are at #EurIPS and want to chat about LLMs' training data, reach out!
📣 New Preprint!
Have you ever wondered what political content is in LLMs' training data? What political opinions are expressed? What is the proportion of left- vs right-leaning documents in the pre- and post-training data? Do they correlate with the political biases reflected in the models?
December 2, 2025 at 9:47 PM
Reposted by Martin Tutek
📢 Postdoc position 📢

I’m recruiting a postdoc for my lab at NYU! Topics include LM reasoning, creativity, limitations of scaling, AI for science, & more! Apply by Feb 1.

(Different from NYU Faculty Fellows, which are also great but less connected to my lab.)

Link in 🧵
December 2, 2025 at 4:04 PM
Reposted by Martin Tutek
Fifteen Years

xkcd.com/3172/
November 26, 2025 at 10:32 PM
Reposted by Martin Tutek
There's a reviewer at ICLR who apparently always writes *exactly* 40 weaknesses and comments no matter what paper he's reviewing.

Exhibit A: openreview.net/forum?id=8qk...
Exhibit B: openreview.net/forum?id=GlX...
Exhibit C: openreview.net/forum?id=kDh...
November 15, 2025 at 2:42 PM
*Urgently* looking for emergency reviewers for the ARR October Interpretability track 🙏🙏

ReSkies much appreciated
November 11, 2025 at 10:29 AM
Reposted by Martin Tutek
Full house at BlackboxNLP at #EMNLP2025!! Getting ready for my 1.45PM keynote 😎 Join us in A102 to learn about "Memorization: myth or mystery?"
November 9, 2025 at 3:05 AM
Reposted by Martin Tutek
𝙒𝙚'𝙧𝙚 𝙝𝙞𝙧𝙞𝙣𝙜 𝙣𝙚𝙬 𝙛𝙖𝙘𝙪𝙡𝙩𝙮 𝙢𝙚𝙢𝙗𝙚𝙧𝙨!

KSoC: utah.peopleadmin.com/postings/190... (AI broadly)

Education + AI:
- utah.peopleadmin.com/postings/189...
- utah.peopleadmin.com/postings/190...

Computer Vision:
- utah.peopleadmin.com/postings/183...
November 7, 2025 at 11:35 PM
Reposted by Martin Tutek
Outstanding paper (5/7):

"Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps"
by Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, and Yonatan Belinkov
aclanthology.org/2025.emnlp-m...

6/n
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
Martin Tutek, Fateme Hashemi Chaleshtori, Ana Marasovic, Yonatan Belinkov. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.
aclanthology.org
November 7, 2025 at 10:32 PM
Very honored that our paper is one of seven outstanding papers at this year's EMNLP :)

Huge thanks to my amazing collaborators @fatemehc.bsky.social @anamarasovic.bsky.social @boknilev.bsky.social , this would not have been possible without them!
November 7, 2025 at 8:58 AM
Reposted by Martin Tutek
Presenting today our work "Unsupervised Word-level Quality Estimation Through the Lens of Annotator (Dis)agreement" at the Machine Translation morning session (Room A301, 11:45 China time). See you there! 🤗

Paper: aclanthology.org/2025.emnlp-m...
Slides/video/poster: underline.io/events/502/s...
Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement
Gabriele Sarti, Vilém Zouhar, Malvina Nissim, Arianna Bisazza. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025.
aclanthology.org
November 6, 2025 at 1:19 AM
Reposted by Martin Tutek
Here’s a custom feed for #EMNLP2025. Click the pin to save it to your home screen!
November 2, 2025 at 3:15 PM
Flying out to @emnlpmeeting soon🇨🇳
I'll present our parametric CoT faithfulness work (arxiv.org/abs/2502.14829) on Wednesday at the second Interpretability session, 16:30-18:00 local time, room A104-105

If you're in Suzhou, reach out to talk all things reasoning :)
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. Despite much work o...
arxiv.org
October 31, 2025 at 1:30 PM
Reposted by Martin Tutek
⏰ One week left to apply for the two PhD Fellowships in Trustworthy NLP and Explainable NLU! The two positions have a starting date in spring 2026. Check the original post for more details👇
Available #NLProc PhD positions:
- Explainable NLU, main supervisor: myself, start in Spring 2026 tinyurl.com/3uset3dm
- Trustworthy NLP, main supervisor: @apepa.bsky.social, start in Spring 2026 tinyurl.com/yxj8yk4m
- Open-topic: express interest via ELLIS, start in Autumn 2026 tinyurl.com/2hcxexyx
LinkedIn
This link will take you to a page that’s not on LinkedIn
lnkd.in
October 24, 2025 at 8:30 AM
Reposted by Martin Tutek
📣Tomorrow at #COLM2025:

1️⃣ Purbid's 𝐩𝐨𝐬𝐭𝐞𝐫 at 𝐒𝐨𝐋𝐚𝐑 (𝟏𝟏:𝟏𝟓𝐚𝐦-𝟏:𝟎𝟎𝐩𝐦) on catching redundant preference pairs & how pruning them hurts accuracy; www.anamarasovic.com/publications...

2️⃣ My 𝐭𝐚𝐥𝐤 at 𝐗𝐋𝐋𝐌-𝐑𝐞𝐚𝐬𝐨𝐧-𝐏𝐥𝐚𝐧 (𝟏𝟐𝐩𝐦) on measuring CoT faithfulness by looking at internals, not just behaviorally

1/3
October 9, 2025 at 4:54 PM
If you're at COLM, check out various works by Ana and her group!
📣Tomorrow at #COLM2025:

1️⃣ Purbid's 𝐩𝐨𝐬𝐭𝐞𝐫 at 𝐒𝐨𝐋𝐚𝐑 (𝟏𝟏:𝟏𝟓𝐚𝐦-𝟏:𝟎𝟎𝐩𝐦) on catching redundant preference pairs & how pruning them hurts accuracy; www.anamarasovic.com/publications...

2️⃣ My 𝐭𝐚𝐥𝐤 at 𝐗𝐋𝐋𝐌-𝐑𝐞𝐚𝐬𝐨𝐧-𝐏𝐥𝐚𝐧 (𝟏𝟐𝐩𝐦) on measuring CoT faithfulness by looking at internals, not just behaviorally

1/3
October 9, 2025 at 4:58 PM
🤔What happens when LLM agents must choose between achieving their goals and avoiding harm to humans in realistic management scenarios? Are LLMs pragmatic, or do they prefer to avoid harming humans?

🚀 New paper out: ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs🚀🧵
October 8, 2025 at 3:14 PM
I won't be at COLM, so come see Yonatan talk about our work on estimating CoT faithfulness using machine unlearning!

Check out the thread for the (many) other interesting works from his group 🎉
At the #Interplay25 workshop, Friday ~11:30, I'll present on measuring *parametric* CoT faithfulness on behalf of @mtutek.bsky.social, who couldn't travel:
bsky.app/profile/mtut...

Later that day we'll have a poster on predicting success of model editing by Yanay Soker, who also couldn't travel
October 7, 2025 at 1:47 PM
Reposted by Martin Tutek
Here’s a #COLM2025 feed!

Pin it 📌 to follow along with the conference this week!
October 6, 2025 at 8:26 PM
Reposted by Martin Tutek
Josip Jukić, Martin Tutek, Jan Šnajder
Context Parametrization with Compositional Adapters
https://arxiv.org/abs/2509.22158
September 29, 2025 at 7:47 AM