BlackboxNLP
@blackboxnlp.bsky.social
The largest workshop on analysing and interpreting neural networks for NLP.

BlackboxNLP will be held at EMNLP 2025 in Suzhou, China

blackboxnlp.github.io
Nicolò & Mingyang: Can we understand which circuits emerge in small models and reasoning-tuned systems, and how do they compare with default systems? Are there methods that generalize better across all tasks?
November 9, 2025 at 7:23 AM
Q: What's next for interpretability benchmarks? Michal: People sitting together and planning how to extend tests to multimodal, diverse contexts. @michaelwhanna.bsky.social: For circuit finding, integrating sparse features circuits could help us better understand our models.
November 9, 2025 at 7:21 AM
Nicolò & Mingyang: Exploring notebooks and public libraries can be very helpful for gaining early intuitions about what's promising.
November 9, 2025 at 7:16 AM
@michaelwhanna.bsky.social: Don't try to read everything. Find Qs you really care about, and go a level deeper to answer meaningful questions.
November 9, 2025 at 7:15 AM
Q: How would one go about approaching interpretability research these days? Michal: "When things don't work out of the box, it's a sign to double down and find out why. Negative results are important!"
November 9, 2025 at 7:15 AM
@danaarad.bsky.social: As deep learning research converges on similar architectures across modalities, it will be interesting to determine which interpretability methods remain useful across various models and tasks.
November 9, 2025 at 7:15 AM
@michaelwhanna.bsky.social, Nicolò & Mingyang: Counterfactuals in minimal settings can be helpful, but they do not capture the whole story. Extending current methods to long contexts and finding practical applications in safety-related areas are exciting challenges ahead.
November 9, 2025 at 7:07 AM
Michal: Mechanistic interpretability has heavily focused on toy tasks and text-only models. The next step is scaling to more complex tasks that involve real-world reasoning.
November 9, 2025 at 7:07 AM
Reposted by BlackboxNLP
I'll be presenting this work at @blackboxnlp.bsky.social in Suzhou — happy to chat there or here if you are interested!
October 22, 2025 at 8:16 AM
Reposted by BlackboxNLP
Nov 9, @blackboxnlp.bsky.social , 11:00-12:00 @ Hall C – Interpreting Language Models Through Concept Descriptions: A Survey (Feldhus & Kopf) @lkopf.bsky.social

🗞️ aclanthology.org/2025.blackbo...

bsky.app/profile/nfel...
November 6, 2025 at 7:00 AM
Results + technical report deadline: August 8, 2025
Full task details: blackboxnlp.github.io/2025/task/
BlackboxNLP 2025
The Eighth Workshop on Analyzing and Interpreting Neural Networks for NLP
blackboxnlp.github.io
July 30, 2025 at 5:57 AM