Nishant Subramani @ ACL
@nsubramani23.bsky.social
PhD student @CMU LTI - working on model #interpretability, student researcher @google; prev predoc @ai2; intern @MSFT
nishantsubramani.github.io
Pinned
👏🏽 Intro

💼 PhD student @ltiatcmu.bsky.social

📜 My research is in model interpretability 🔎, understanding the internals of LLMs to build more controllable and trustworthy systems

🫵🏽 If you are interested in better understanding language technology or model interpretability, let's connect!
Reposted by Nishant Subramani @ ACL
We discovered that language models leave a natural "signature" on their API outputs that's extremely hard to fake. Here's how it works 🔍

📄 arxiv.org/abs/2510.14086 1/
Every Language Model Has a Forgery-Resistant Signature
The ubiquity of closed-weight language models with public-facing APIs has generated interest in forensic methods, both for extracting hidden model details (e.g., parameters) and for identifying...
October 17, 2025 at 5:59 PM
At @colmweb.org all week 🥯🍁! Presenting 3 mechinterp + actionable interp papers at @interplay-workshop.bsky.social

1. BERTology in the Modern World w/ @bearseascape.bsky.social
2. MICE for CATs
3. LLM Microscope w/ Jiarui Liu, Jivitesh Jain, @monadiab77.bsky.social

Reach out to chat! #COLM2025
October 6, 2025 at 10:08 PM
Excited to be attending NEMI in Boston today to present 🐁 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools, and to co-moderate the model steering and control roundtable! Come find me to connect and chat about steering and actionable interp.
August 22, 2025 at 12:28 PM
At #ACL2025 in Vienna 🇦🇹 till next Saturday! Love to chat about anything #interpretability 🔎, understanding model internals 🔬, and finding yummy vegan food 🥬
July 25, 2025 at 9:53 PM
At #ICML2025 🇨🇦 till Sunday! Love to chat about #interpretability, understanding model internals, and finding yummy vegan food in Vancouver 🥬🍜
July 14, 2025 at 5:33 PM
🚨 Check out our new #interpretability paper: 🕵🏽 Model Internal Sleuthing, led by the amazing @bearseascape.bsky.social, an undergrad at @scsatcmu.bsky.social @ltiatcmu.bsky.social
🚨New #interpretability paper with @nsubramani23.bsky.social: 🕵️ Model Internal Sleuthing: Finding Lexical Identity and Inflectional Morphology in Modern Language Models
June 4, 2025 at 5:41 PM
Excited to announce that I started at @googleresearch.bsky.social on the cloud team as a student researcher last month, working with Hamid Palangi on actionable #interpretability 🔍 to build better tool-using #agents ⚒️🤖
June 2, 2025 at 4:35 PM
Presenting 🐭🐱 MICE for CATs today at the poster session at #NAACL2025!

Come chat about interpretability, trustworthiness, and tool-using agents!

🗓️ - Thursday May 1st (today)
📍 - Hall 3
🕑 - 2:00-3:30pm
May 1, 2025 at 3:28 PM
At #NAACL2025 🌵till Sunday! Love to chat about interpretability, understanding model internals, and finding vegan food 🥬
April 30, 2025 at 3:03 PM
🚀 Excited to share a new interp+agents paper: 🐭🐱 MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools appearing at #NAACL2025

This was work done @msftresearch.bsky.social last summer with Jason Eisner, Justin Svegliato, Ben Van Durme, Yu Su, and Sam Thomson

1/🧵
April 29, 2025 at 1:41 PM
Reposted by Nishant Subramani @ ACL
Have these people met … society? Read a book? Listened to music? Regurgitating esoteric facts isn’t intelligence.

This is more like humanity's last stand at Jeopardy!

www.nytimes.com/2025/01/23/t...
A Test So Hard No AI System Can Pass It — Yet
The creators of a new test called “Humanity’s Last Exam” argue we may soon lose the ability to create tests hard enough for A.I. models.
January 25, 2025 at 6:15 PM
1) I'm working on using intermediate model generations from LLMs to calibrate tool-using agents ⚒️🤖 better than their raw output probabilities can! Turns out you can 🥳 (toy sketch below)

2) There's gotta be a nice geometric understanding of what's going on within LLMs when we tune them 🤔
Bluesky academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren’t working on but keep thinking about

1. I came to hate my work and thinking so don't do it anymore.
2.
Bluesky academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren’t working on but keep thinking about

1. Convincing everyone that everything is luck, all the way down.

2. LLMs can reason and understand in the external sense.
November 18, 2024 at 12:11 AM
Reposted by Nishant Subramani @ ACL
Utah is hiring tenure-track/tenured faculty & a priority area is NLP! 

Please reach out over email if you have questions about the school and Salt Lake City, happy to share my experience so far. 

utah.peopleadmin.com/postings/154...
October 27, 2023 at 5:48 PM