Nishant Subramani @ ACL
@nsubramani23.bsky.social
PhD student @CMU LTI - working on model #interpretability, student researcher @google; prev predoc @ai2; intern @MSFT
nishantsubramani.github.io
Pinned
👏🏽 Intro

💼 PhD student @ltiatcmu.bsky.social

📜 My research is in model interpretability 🔎, understanding the internals of LLMs to build more controllable and trustworthy systems

🫵🏽 If you are interested in better understanding language technology or model interpretability, let's connect!
Reposted by Nishant Subramani @ ACL
We discovered that language models leave a natural "signature" on their API outputs that's extremely hard to fake. Here's how it works 🔍

📄 arxiv.org/abs/2510.14086 1/
Every Language Model Has a Forgery-Resistant Signature
The ubiquity of closed-weight language models with public-facing APIs has generated interest in forensic methods, both for extracting hidden model details (e.g., parameters) and for identifying...
October 17, 2025 at 5:59 PM
At @colmweb.org all week 🥯🍁! Presenting 3 mechinterp + actionable interp papers at @interplay-workshop.bsky.social

1. BERTology in the Modern World w/ @bearseascape.bsky.social
2. MICE for CATs
3. LLM Microscope w/ Jiarui Liu, Jivitesh Jain, @monadiab77.bsky.social

Reach out to chat! #COLM2025
October 6, 2025 at 10:08 PM
Excited to be attending NEMI in Boston today to present 🐁 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools, and to co-moderate the model steering and control roundtable! Come find me to connect and chat about steering and actionable interp.
August 22, 2025 at 12:28 PM
At #ACL2025 in Vienna 🇦🇹 till next Saturday! Love to chat about anything #interpretability 🔎, understanding model internals 🔬, and finding yummy vegan food 🥬
July 25, 2025 at 9:53 PM
At #ICML2025 🇨🇦 till Sunday! Love to chat about #interpretability, understanding model internals, and finding yummy vegan food in Vancouver 🥬🍜
July 14, 2025 at 5:33 PM
🚨 Check out our new #interpretability paper: 🕵🏽 Model Internal Sleuthing, led by the amazing @bearseascape.bsky.social, an undergrad at @scsatcmu.bsky.social @ltiatcmu.bsky.social
🚨New #interpretability paper with @nsubramani23.bsky.social: 🕵️ Model Internal Sleuthing: Finding Lexical Identity and Inflectional Morphology in Modern Language Models
June 4, 2025 at 5:41 PM
Excited to announce that I started at @googleresearch.bsky.social on the cloud team as a student researcher last month, working with Hamid Palangi on actionable #interpretability 🔍 to build better tool-using #agents ⚒️🤖
June 2, 2025 at 4:35 PM
Presenting 🐭🐱 MICE for CATs today at the poster session at #NAACL2025!

Come chat about interpretability, trustworthiness, and tool-using agents!

🗓️ - Thursday May 1st (today)
📍 - Hall 3
🕑 - 2:00-3:30pm
May 1, 2025 at 3:28 PM
At #NAACL2025 🌵till Sunday! Love to chat about interpretability, understanding model internals, and finding vegan food 🥬
April 30, 2025 at 3:03 PM
🚀 Excited to share a new interp+agents paper: 🐭🐱 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools appearing at #NAACL2025

This was work done @msftresearch.bsky.social last summer with Jason Eisner, Justin Svegliato, Ben Van Durme, Yu Su, and Sam Thomson

1/🧵
April 29, 2025 at 1:41 PM
Reposted by Nishant Subramani @ ACL
Have these people met … society? Read a book? Listened to music? Regurgitating esoteric facts isn’t intelligence.

This is more like humanity's last stand at Jeopardy!

www.nytimes.com/2025/01/23/t...
A Test So Hard No AI System Can Pass It — Yet
The creators of a new test called “Humanity’s Last Exam” argue we may soon lose the ability to create tests hard enough for A.I. models.
January 25, 2025 at 6:15 PM
1) I'm working on using intermediate model generations from LLMs to calibrate tool-using agents ⚒️🤖 better than their raw output probabilities can! Turns out you can 🥳 (toy sketch below)

2) There's gotta be a nice geometric understanding of what's going on within LLMs when we tune them 🤔
Bluesky academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren’t working on but keep thinking about

1. I came to hate my work and thinking so don't do it anymore.
2.
Bluesky academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren’t working on but keep thinking about

1. Convincing everyone that everything is luck, all the way down.

2. LLMs can reason and understand in the external sense.
November 18, 2024 at 12:11 AM
Reposted by Nishant Subramani @ ACL
Utah is hiring tenure-track/tenured faculty & a priority area is NLP! 

Please reach out over email if you have questions about the school and Salt Lake City, happy to share my experience so far. 

utah.peopleadmin.com/postings/154...
October 27, 2023 at 5:48 PM