Sagnik Mukherjee
@sagnikmukherjee.bsky.social
NLP PhD student @convai_uiuc | Agents, Reasoning, Evaluation, etc.
https://sagnikmukherjee.github.io

https://scholar.google.com/citations?user=v4lvWXoAAAAJ&hl=en
🧵[7/n]

🔍 Potential Reasons

💡 We hypothesize that the in-distribution nature of training data is a key driver behind this sparsity
🧠 The model already "knows" a lot — RL just fine-tunes a small, relevant subnetwork rather than overhauling everything
May 21, 2025 at 3:50 AM
🧵[6/n]

🌐 The Subnetwork Is General
🔁 Subnetworks trained with different seeds, datasets, or even algorithms show nontrivial overlap
🧩 Suggests the subnetwork is a generalizable structure tied to the base model
🧠 A shared backbone seems to emerge, no matter how you train it (overlap sketch below)
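A minimal sketch of how such overlap could be measured from two checkpoints, assuming PyTorch state dicts with matching keys (the function names and exact-zero comparison are illustrative, not the paper's code):

```python
import torch

def update_mask(base_sd, tuned_sd, atol=0.0):
    """Boolean mask per parameter tensor: True where training changed the value."""
    return {k: (tuned_sd[k] - base_sd[k]).abs() > atol for k in base_sd}

def mask_overlap(mask_a, mask_b):
    """Fraction of parameters updated in run A that were also updated in run B."""
    inter = sum((mask_a[k] & mask_b[k]).sum().item() for k in mask_a)
    total_a = sum(mask_a[k].sum().item() for k in mask_a)
    return inter / max(total_a, 1)
```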
May 21, 2025 at 3:50 AM
🧵[5/n]📊
🧪 Training the Subnetwork Reproduces Full Model

1️⃣ When trained in isolation (gradient-masking sketch below), the sparse subnetwork recovers almost exactly the same weights as the full model
2️⃣ It achieves comparable (or better) end-task performance
3️⃣ 🧮 Even the training loss converges more smoothly
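One simple way to train only a fixed subnetwork is to zero out gradients for every parameter outside a precomputed mask; a minimal PyTorch sketch (illustrative, not the paper's training code):

```python
def train_subnetwork_only(model, masks):
    """masks: parameter name -> bool tensor, True = parameter is in the subnetwork."""
    for name, param in model.named_parameters():
        mask = masks[name].to(device=param.device, dtype=param.dtype)
        # The default argument binds this parameter's mask; the hook zeroes
        # gradients outside the subnetwork before the optimizer sees them.
        param.register_hook(lambda grad, m=mask: grad * m)
```

In a real run, weight decay and optimizer momentum would also need to be disabled for the frozen parameters so they stay exactly fixed.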
May 21, 2025 at 3:50 AM
🧵[4/n]

📚 Each Layer Is Equally Sparse (or Dense)

📏 No specific layer or sublayer gets special treatment — all layers are updated equally sparsely.
🎯 Despite the sparsity, the updates are still full-rank (a rough check for both properties is sketched below)
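A rough way to check per-layer sparsity and update rank, assuming PyTorch state dicts (illustrative, not the paper's evaluation script):

```python
import torch

def layer_update_stats(base_sd, tuned_sd):
    """Per-parameter sparsity of the update, plus the rank of 2-D updates."""
    stats = {}
    for name in base_sd:
        delta = (tuned_sd[name] - base_sd[name]).float()
        sparsity = (delta == 0).float().mean().item()  # fraction left unchanged
        rank = torch.linalg.matrix_rank(delta).item() if delta.ndim == 2 else None
        stats[name] = {"sparsity": sparsity, "rank": rank}
    return stats
```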
May 21, 2025 at 3:50 AM
🧵[3/n]

📉 Even Gradients Are Sparse in RL 📉

🧠 In PRIME, 72% of parameters never receive any gradient — ever!
↔️ Some do, but their gradients cancel out over time.
🎯 It’s not just the updates that are sparse; the gradients themselves are sparse (bookkeeping sketch below)
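A sketch of the bookkeeping needed to check this during training; model, loader, loss_fn, and optimizer are assumed to be supplied by the caller (this is not PRIME's code):

```python
import torch

def track_touched_params(model, loader, loss_fn, optimizer, steps):
    """Return the fraction of parameters that never received a nonzero gradient."""
    touched = {n: torch.zeros_like(p, dtype=torch.bool)
               for n, p in model.named_parameters()}
    for _, batch in zip(range(steps), loader):
        loss_fn(model, batch).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                touched[n] |= (p.grad != 0)  # mark any nonzero gradient
        optimizer.step()
        optimizer.zero_grad()
    never = sum((~t).sum().item() for t in touched.values())
    total = sum(t.numel() for t in touched.values())
    return never / total
```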
May 21, 2025 at 3:50 AM
🧵[2/n]

💡 SFT Updates Are Dense 💡
Unlike RL updates, Supervised Fine-Tuning (SFT) updates are much denser 🧠
📊 Update sparsity is low: at most 15.31% of parameters remain untouched.
May 21, 2025 at 3:50 AM
🚨 Paper Alert: “RL Finetunes Small Subnetworks in Large Language Models”

From DeepSeek V3 Base to DeepSeek R1 Zero, a whopping 86% of parameters were NOT updated during RL training 😮😮
And this isn’t a one-off. The pattern holds across RL algorithms and models.
🧵A Deep Dive
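A minimal sketch of how that untouched fraction can be computed by diffing the two checkpoints, assuming PyTorch state dicts (the exact comparison in the paper may differ):

```python
import torch

def untouched_fraction(base_sd, tuned_sd):
    """Fraction of parameters with identical values before and after RL."""
    unchanged = sum(torch.eq(base_sd[k], tuned_sd[k]).sum().item() for k in base_sd)
    total = sum(v.numel() for v in base_sd.values())
    return unchanged / total
```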
May 21, 2025 at 3:50 AM
📈 Our results:
PARC improves error detection accuracy by 6-16%, enabling more reliable step-level verification in mathematical reasoning chains.

🧵[5/n]
May 7, 2025 at 6:52 PM
📊 The exciting part?
LLMs can reliably identify these critical premises - the specific prior statements that directly support each reasoning step. This creates a transparent structure showing exactly which information is necessary for each conclusion.
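A toy illustration of what such a structure could look like (Python 3.9+); the field names and example are illustrative, not the paper's schema:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    text: str
    premises: list[int] = field(default_factory=list)  # indices of supporting steps

chain = [
    Step("The train travels 60 km in 1.5 hours."),              # step 0 (given)
    Step("Speed = distance / time = 60 / 1.5.", premises=[0]),  # step 1
    Step("So the speed is 40 km/h.", premises=[1]),             # step 2
]
# A verifier can now check each step against only its listed premises.
```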

🧵[4/n]
May 7, 2025 at 6:52 PM
🚀Our ICML 2025 paper introduces "Premise-Augmented Reasoning Chains" - a structured approach to induce explicit dependencies in reasoning chains.

By revealing the dependencies within chains, we significantly improve how LLM reasoning can be verified.

🧵[1/n]
May 7, 2025 at 6:52 PM
🚩 We discovered some key gaps: Incomplete cultural coverage, issues with methodological robustness, and a lack of situated studies for real-world applicability. These gaps limit our understanding of cultural biases in LLMs. [5/7]
November 21, 2024 at 10:03 PM
📚 Most studies use black-box probing methods to examine LLMs' cultural biases. However, these methods can be sensitive to prompt wording, raising concerns about robustness and generalizability. [4/7]
November 21, 2024 at 10:03 PM
Moreover, following Hershcovich et al. (2022), we examined the Linguistic-Cultural Interaction in current Cultural LLM research. Notably, none of the papers we reviewed address the concept of "Aboutness." [3/7]
November 21, 2024 at 10:03 PM
📊 We categorize cultural proxies into demographic proxies (such as region, language, ethnicity) and semantic proxies (such as values, norms, food habits). Current research mainly explores values and norms, leaving many cultural aspects unexplored. [2/7]
November 21, 2024 at 10:03 PM
📢📢LLMs are biased towards Western Culture. Well, okay, but what do you mean by "Culture"?
In our survey on cultural bias in LLMs, we reviewed ~90 papers. Interestingly, none of these papers define "culture" explicitly. They use “proxies”. [1/7]
[Appeared in the EMNLP main conference]
November 21, 2024 at 10:03 PM