Patrick Kahardipraja
pkhdipraja.bsky.social
PhD student @ Fraunhofer HHI. Interpretability, incremental NLP, and NLU. https://pkhdipraja.github.io/
We will be presenting the paper at #ACL2025NLP 🇦🇹. Feel free to stop by the poster to say hello!

📅 29/07 (Tue) 10:30-12:00
📍 Hall 4/5

#NLProc #interpretability #XAI #mechinterp #MLSky
July 16, 2025 at 1:26 PM
We support multiple LLM providers as well as locally hosted LLMs. For more details, check out our paper! arxiv.org/abs/2502.16994. This project was led by @brunibrun.bsky.social, Aakriti Jain & @golimblevskaia.bsky.social, with help from Thomas Wiegand, Wojciech Samek, @slapuschkin.bsky.social & me.
FADE: Why Bad Descriptions Happen to Good Features
Recent advances in mechanistic interpretability have highlighted the potential of automating interpretability pipelines in analyzing the latent representations within LLMs. While they may enhance our ...
arxiv.org
July 16, 2025 at 1:26 PM
FADE quantifies the causes of feature-to-description misalignment and highlights challenges for current methods, including various failure modes, why SAE features are harder to describe than MLP features, and how the interpretability of feature descriptions varies across layers.
July 16, 2025 at 1:26 PM
Thanks for sharing! We are looking into the works you suggested and plan to discuss them in the next revision of this paper :)
May 28, 2025 at 7:28 PM
Many thanks to my amazing co-authors: @reduanachtibat.bsky.social, Thomas Wiegand, Wojciech Samek, @slapuschkin.bsky.social !

#NLProc #interpretability #XAI #mechinterp #MLSky
May 26, 2025 at 4:01 PM
Building on these insights, we present a probe that tracks knowledge provenance during inference and shows where it is localized within the input prompt. Our attempt shows promising results, with >94% ROC AUC and >84% localization accuracy.

4/4
May 26, 2025 at 4:01 PM
We analyze how in-context heads can specialize to understand instructions (task heads) and to retrieve relevant information (retrieval heads). Together with parametric heads, we investigate their causal roles by extracting function vectors and modifying their weights.

3/4
May 26, 2025 at 4:01 PM
Using interpretability tools, we discover that heads important for RAG fall into two categories: parametric heads, which encode relational knowledge, and in-context heads, which are responsible for processing information in the prompt.

2/4
May 26, 2025 at 4:01 PM
Can you please add me? Thanks!
November 26, 2024 at 3:44 PM
Hi, would love to be added :)
November 19, 2024 at 9:28 AM