Zach Studdiford
@zstuddiford.bsky.social
Undergrad researcher @UWMadison.bsky.social | Language and cognition | Representational alignment in LLMs
Thanks to @siddsuresh97.bsky.social, @kushinm.bsky.social, and Tim Rogers for their support on this work!
I’m excited to be applying to grad school this cycle, so reach out if you want to chat about this!
Code 💻: github.com/Knowledge-an...
Paper 📄: arxiv.org/abs/2510.01030
10/10
GitHub - Knowledge-and-Concepts-Lab/llm-alignment-benchmark: Measuring alignment of transformer models for different architecture/param size/training data/RL (github.com)
Further, we think this project really showcases how hard-earned data from the cognitive sciences (THINGS), methods for inferring representations from behavior, and open model-sharing platforms (Hugging Face) can come together to answer the big questions! 9/10
We really think measuring human-model representational alignment is a necessary complement to evaluating only how human-like model behavior is. We want models that are aligned at multiple levels of analysis! Read arxiv.org/abs/2310.13018 for a great primer on the topic. 8/10
Getting aligned on representational alignment (arxiv.org)
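For a concrete feel of what "measuring alignment" can mean, here is a minimal RSA-style sketch: take human and model concept-by-concept similarity matrices and rank-correlate their unique pairs. The random matrices and function names below are placeholders, not our exact pipeline.

```python
# Minimal sketch: RSA-style alignment between two similarity matrices.
# Inputs are random placeholders, not real human or model data.
import numpy as np
from scipy.stats import spearmanr

def representational_alignment(human_sim: np.ndarray, model_sim: np.ndarray) -> float:
    """Spearman correlation over the unique (upper-triangle) concept pairs."""
    iu = np.triu_indices_from(human_sim, k=1)
    rho, _ = spearmanr(human_sim[iu], model_sim[iu])
    return float(rho)

# Toy usage with symmetric random matrices standing in for similarity estimates.
rng = np.random.default_rng(0)
n = 50
human = rng.random((n, n))
human = (human + human.T) / 2
model = human + 0.1 * rng.standard_normal((n, n))
model = (model + model.T) / 2
print(f"alignment (Spearman rho): {representational_alignment(human, model):.2f}")
```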
That’s not all! We found that popular benchmarks (e.g., BigBench-Hard, MMLU) correlate somewhat with human-model alignment, but leave much variance unexplained. So, you won’t get to human-like models by pure benchmark climbing. 7/10
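To make "leaves much variance unexplained" concrete, here is a toy sketch that regresses alignment on a benchmark score and reads off R^2; the numbers are invented for illustration and are not our results.

```python
# Toy illustration: how much variance in alignment does a benchmark explain?
# All numbers are made up; they are not results from the paper.
import numpy as np
from scipy.stats import linregress

benchmark = np.array([0.42, 0.55, 0.61, 0.64, 0.70, 0.73, 0.78, 0.81])
alignment = np.array([0.30, 0.52, 0.41, 0.58, 0.49, 0.66, 0.57, 0.62])

fit = linregress(benchmark, alignment)
print(f"r = {fit.rvalue:.2f}, R^2 = {fit.rvalue ** 2:.2f}")
# Even a moderate r leaves a 1 - R^2 share of the variance unexplained.
```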
Model size and the addition of multimodal (vision-language) capabilities had no effect on alignment when controlling for variance explained by other factors. This challenges the popular notion that ‘scale is all you need’. 6/10
So, what did we find? Most notably, instruction fine-tuning and larger dimensionality (for MLPs, hidden layers, and context lengths) supported higher human-model representational alignment. 5/10
Using a subset of concepts from @martin_hebart et al.'s THINGS dataset as the key target for our experiments, we had models complete the triplet judgment task and estimated semantic embeddings based on human and model judgments. 4/10
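For intuition about the estimation step, here is a small sketch in the spirit of SPoSE-style models: fit an embedding so that the observed odd-one-out choices are likely under a softmax over pairwise similarities. The synthetic data, dimensionality, and optimizer settings are assumptions for illustration, not the procedure from our paper.

```python
# Sketch: infer an embedding from triplet odd-one-out choices (synthetic data).
import torch

n_concepts, dim, n_triplets = 30, 10, 2000
rng = torch.Generator().manual_seed(0)

# A synthetic "ground truth" embedding, used only to simulate judgments.
true_emb = torch.randn(n_concepts, dim, generator=rng)

# Random triplets of three distinct concepts (i, j, k).
triplets = torch.stack([torch.randperm(n_concepts, generator=rng)[:3]
                        for _ in range(n_triplets)])

def pair_sims(emb, t):
    """Dot-product similarity of the three pairs in each triplet."""
    i, j, k = t[:, 0], t[:, 1], t[:, 2]
    return torch.stack([(emb[i] * emb[j]).sum(-1),        # pair (i, j)
                        (emb[i] * emb[k]).sum(-1),        # pair (i, k)
                        (emb[j] * emb[k]).sum(-1)], dim=1)  # pair (j, k)

# Simulated judgments: the chooser keeps the most similar pair.
choices = pair_sims(true_emb, triplets).argmax(dim=1)

# Fit a fresh embedding by maximizing the likelihood of those choices.
emb = torch.nn.Parameter(0.01 * torch.randn(n_concepts, dim, generator=rng))
opt = torch.optim.Adam([emb], lr=0.1)
for _ in range(300):
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(pair_sims(emb, triplets), choices)
    loss.backward()
    opt.step()
print(f"mean negative log-likelihood per triplet: {loss.item():.3f}")
```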
Next, we needed models that sufficiently vary in key computational ingredients. We thus constructed a suite of over 70 open-weight models that vary along several parameters of interest, including model scale, architecture, embedding size, degree of post-training, and more! 3/10
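As a rough idea of what assembling a suite like this looks like, here is a sketch using the Hugging Face transformers library. The model IDs and metadata below are placeholders, not our actual 70+ model list.

```python
# Placeholder suite: Hub IDs plus the factors being varied across models.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_SUITE = [
    ("gpt2",                     {"params": "124M", "instruction_tuned": False}),
    ("gpt2-xl",                  {"params": "1.5B", "instruction_tuned": False}),
    ("Qwen/Qwen2.5-7B-Instruct", {"params": "7B",   "instruction_tuned": True}),
]

def load(model_id: str):
    """Download the tokenizer and weights for one open-weight model."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model

for model_id, meta in MODEL_SUITE:
    print(model_id, meta)  # in practice: load(model_id), then run the triplet task
```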
First, we needed a task to evaluate both humans and models in a relatively equitable way.
We chose a triplet similarity judgment task that has been shown to effectively capture the representational bases of human semantic knowledge. 2/10
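To make the task concrete, here is a toy sketch of a single triplet trial under one simple decision rule: the odd one out is the item least similar, on average, to the other two. The vectors are random placeholders standing in for human or model concept representations, not data from the study.

```python
# Toy triplet odd-one-out trial with placeholder concept vectors.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def odd_one_out(vectors: dict) -> str:
    """Return the concept with the lowest total similarity to the other two."""
    names = list(vectors)
    total_sim = {n: sum(cosine(vectors[n], vectors[m]) for m in names if m != n)
                 for n in names}
    return min(total_sim, key=total_sim.get)

rng = np.random.default_rng(0)
trial = {name: rng.standard_normal(8) for name in ("apple", "banana", "hammer")}
print(odd_one_out(trial))
```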