https://stellalisy.com
We built the first unified taxonomy of 28 cognitive elements underlying reasoning
Spoiler—LLMs commonly employ sequential reasoning, rarely self-awareness, and often fail to use correct reasoning structures🧠
📄Paper: arxiv.org/abs/2511.16660
💻Code: github.com/pkargupta/co...
🤗Data: huggingface.co/collections/...
🌐 Blogpost: tinyurl.com/cognitive-fo...
🔍 Systematic diagnosis of reasoning failures
🎯 Predicting which training→which capabilities
🧪 Testing cognitive theories at scale
🌉 Shared vocabulary bridging cognition & AI research
More on opportunities & challenges in📄
We scaffold cognitive structures from successful traces to guide reasoning.
Major gains on ill-structured problems🌟
Models possess latent capabilities—they just don't deploy them adaptively without explicit guidance.
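The scaffolding idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the prompt format and element names here are assumptions:

```python
# Minimal sketch: turn the cognitive-element sequence mined from a
# successful trace into an explicit scaffold that guides the model
# on a new (ill-structured) problem.

def build_scaffold(element_sequence, problem):
    """Render a sequence of cognitive elements as numbered guidance steps."""
    steps = "\n".join(
        f"{i + 1}. Apply {element.replace('_', ' ')}."
        for i, element in enumerate(element_sequence)
    )
    return (
        "Solve the problem by following these reasoning steps:\n"
        f"{steps}\n\nProblem: {problem}"
    )

prompt = build_scaffold(
    ["selective_attention", "knowledge_alignment", "forward_chaining"],
    "Design a fair scheduling policy.",
)
```

The point of the explicit scaffold is exactly the thread's claim: the capability is latent, so surfacing the structure in the prompt is enough to change deployment.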
Even with correct answers—underlying mechanisms diverge fundamentally.
Research concentrates on easily quantifiable behaviors—sequential organization (55%), decomposition (60%)
Neglects meta-cognitive controls (8-16%) and alternative representations (10-27%) that correlate with success⚠️
We introduce a method for extracting reasoning structure from traces
Successful: selective attention → knowledge alignment → forward chaining
Common: skip straight to forward chaining
LLMs prematurely seek a solution before understanding the constraints‼️
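The ordering contrast above can be checked mechanically. A minimal sketch, assuming traces come pre-annotated with span-level element labels (the label strings are illustrative):

```python
# Sketch: given a trace annotated with an ordered list of cognitive
# elements, test whether it follows the "successful" ordering
# (selective attention -> knowledge alignment -> forward chaining)
# or skips straight to forward chaining.

SUCCESSFUL_PREFIX = ["selective_attention", "knowledge_alignment", "forward_chaining"]

def follows_successful_order(trace_elements):
    """True if SUCCESSFUL_PREFIX appears as an in-order subsequence."""
    it = iter(trace_elements)  # membership tests consume the iterator,
    return all(step in it for step in SUCCESSFUL_PREFIX)  # enforcing order

good = ["selective_attention", "knowledge_alignment", "forward_chaining", "verification"]
rushed = ["forward_chaining", "verification"]  # skips constraint understanding
```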
Olmo3 exhibits more diverse cognitive elements (49%)—they explicitly included meta-reasoning data during midtraining.
DeepHermes-3: only 12% avg presence.
Training methodology shapes cognitive profiles dramatically.
🤔Self-awareness: 16% in research design, 19% in LLM traces vs 49% in humans
🧐Self-evaluation on non-verifiable problems collapses (53.5% presence, 0.031 correlation)
Models can't self-assess without ground truth.
Logical coherence: 91% of traces, 0.091 correlation w/ success
Knowledge alignment: 20% of traces, 0.234 correlation (high)
Models frequently attempt core elements but fail to execute. Having the capability ≠ deploying it successfully😬
As problems become ill-structured, models narrow their repertoire—but successful traces show the need for greater diversity (successful = high PPMI in the figure).
Sequential organization dominates. Meta-cognition disappears in LLMs.
We introduce a framework for fine-grained span-level cognitive evaluation: WHICH elements appear, WHERE, and HOW they're sequenced.
First analysis of its kind at this scale📊
28 elements across 4 dimensions—reasoning invariants (compositionality, logical coherence), meta-cognitive controls (self-awareness), representations (hierarchical, causal), and operations (backtracking, verification)
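The 4-dimension grouping can be sketched as a simple lookup table. Only the elements named in this thread are listed; the taxonomy has 28 in total, and the exact names/grouping here are illustrative, not the paper's canonical list:

```python
# Sketch of the taxonomy structure: cognitive elements grouped into
# 4 dimensions (subset shown; the full taxonomy has 28 elements).
TAXONOMY = {
    "reasoning_invariants": ["compositionality", "logical_coherence"],
    "meta_cognitive_controls": ["self_awareness", "self_evaluation"],
    "representations": ["hierarchical", "causal"],
    "operations": ["backtracking", "verification", "forward_chaining"],
}

def dimension_of(element):
    """Look up which of the 4 dimensions a cognitive element belongs to."""
    for dim, elements in TAXONOMY.items():
        if element in elements:
            return dim
    raise KeyError(element)
```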
The issue: reasoning is evaluated by outcomes, w/o understanding the cognitive processes that produce them. We can't diagnose failures or predict how training produces capabilities🚨
Poster #77: ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning; led by
@stellali.bsky.social & @jiminmun.bsky.social
📖Paper: arxiv.org/abs/2507.13541
💻Code: github.com/stellalisy/P...
Join us in shaping interpretable AI that you can trust and control🚀Feedback welcome!
#AI #Transparency
📊 Quantify community values at scale
📈 Track how norms evolve over time
🔍 Understand group psychology
📋 Move beyond surveys to revealed preferences
🛡️Smart content moderation—explains why content is flagged/decisions are made
🎯Interpretable LM alignment—revealing prominent attributes
⚙️Controllable personalization—giving user agency to personalize select attributes
r/AskHistorians:📚values verbosity
r/RoastMe:💥values directness
r/confession:❤️values empathy
We visualize each group’s unique preference decisions—no more one-size-fits-all. Understand your audience at a glance🏷️
📈Performance boost: +46.6% vs GPT-4o
💪Outperforms other training-based baselines w/ statistical significance
🕰️Robust to temporal shifts—trained pref models can be used out-of-the-box!
1: 🎛️Train compact, efficient detectors for every attribute
2: 🎯Learn community-specific attribute weights during preference training
3: 🔧Add attribute embeddings to preference model for accurate & explainable predictions
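The 3 steps above can be sketched as attribute-weighted scoring. This is a toy illustration of the idea only; the detector and weight names are placeholders, not the paper's API:

```python
# Sketch of the pipeline: (1) per-attribute detectors score a response,
# (2) community-specific learned weights combine the attribute scores,
# (3) the combined, per-attribute scores make the prediction explainable.

def attribute_scores(text, detectors):
    """Run each compact attribute detector on the text."""
    return {name: det(text) for name, det in detectors.items()}

def community_preference(text, detectors, weights):
    """Combine attribute scores with community-specific weights."""
    scores = attribute_scores(text, detectors)
    return sum(weights.get(attr, 0.0) * s for attr, s in scores.items())

# Toy detectors standing in for trained models:
detectors = {
    "verbosity": lambda t: min(len(t.split()) / 50, 1.0),
    "directness": lambda t: 1.0 if "!" in t else 0.2,
}
# Toy community weight profiles (cf. the subreddit examples in this thread):
askhistorians = {"verbosity": 0.9, "directness": 0.1}
roastme = {"verbosity": 0.1, "directness": 0.9}
```

Because the final score is a weighted sum over named attributes, inspecting the per-attribute contributions is what makes the prediction interpretable.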
📜Define 19 sociolinguistic & cultural attributes from the literature
🏭Novel preference data generation pipeline to isolate attributes
Our data gen pipeline generates pairwise data on *any* decomposed dimension, w/ applications beyond preference modeling
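The attribute-isolation idea can be sketched as follows. A hypothetical sketch: `rewrite` stands in for an LLM call, and the function/field names are illustrative, not the pipeline's actual interface:

```python
# Sketch: generate a pairwise example that differs along exactly one
# target attribute, so the pair isolates that dimension.
# `rewrite` is a placeholder for a controlled LLM rewrite step.

def make_pair(base_response, attribute, rewrite):
    """Return a (chosen, rejected) pair isolated to one attribute."""
    low = rewrite(base_response, attribute, level="low")
    high = rewrite(base_response, attribute, level="high")
    return {"attribute": attribute, "chosen": high, "rejected": low}

# Toy rewrite that just tags the text, to show the pair structure:
toy_rewrite = lambda text, attr, level: f"[{attr}={level}] {text}"
pair = make_pair("Thanks for sharing your story.", "empathy", toy_rewrite)
```

Since nothing in `make_pair` is preference-specific, the same pattern yields contrastive pairs for any decomposed dimension.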