Andrea de Varda
@andreadevarda.bsky.social
Postdoc at MIT BCS, interested in language(s) in humans and LMs

https://andrea-de-varda.github.io/
Token count also captures differences across tasks. Avg. token count predicts avg. RT across domains (r = 0.97, left), and even item-level RTs across all tasks (r = 0.92 (!!), right). (5/6)
November 19, 2025 at 8:14 PM
We found that the number of reasoning tokens generated by the model reliably correlates with human RTs within each task (mean r = 0.57, all ps < .001). (4/6)
November 19, 2025 at 8:14 PM
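A minimal sketch of this within-task analysis (the exact statistical pipeline isn't described in the thread; tokens_by_task and rts_by_task are hypothetical inputs):

```python
# Sketch of the within-task analysis: Pearson correlation between
# item-level reasoning-token counts and human RTs, for each task.
# tokens_by_task / rts_by_task: hypothetical dicts mapping task name
# -> row-aligned lists of item-level values.
from scipy.stats import pearsonr

def per_task_correlations(tokens_by_task, rts_by_task):
    results = {}
    for task, tokens in tokens_by_task.items():
        r, p = pearsonr(tokens, rts_by_task[task])
        results[task] = (r, p)
    return results
```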
Large reasoning models can solve many reasoning problems, but do their computations reflect how humans think?
We compared human RTs to DeepSeek-R1’s CoT length across seven tasks: arithmetic (numeric & verbal), logic (syllogisms & ALE), relational reasoning, intuitive reasoning, and ARC (3/6)
November 19, 2025 at 8:14 PM
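For context, a hedged sketch of how per-item CoT length could be measured, assuming the open-weights DeepSeek-R1 convention of wrapping the reasoning trace in <think>...</think> tags (the thread doesn't specify the extraction step):

```python
# Hedged sketch: count reasoning (CoT) tokens in a DeepSeek-R1 response.
# Assumes the trace is delimited by <think>...</think> tags, as in the
# open-weights release; not necessarily the paper's exact procedure.
import re
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1")

def reasoning_token_count(model_output: str) -> int:
    match = re.search(r"<think>(.*?)</think>", model_output, flags=re.DOTALL)
    cot = match.group(1) if match else ""
    return len(tok.encode(cot, add_special_tokens=False))
```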
Encoding models trained on existing fMRI datasets successfully predicted responses in new languages, generalizing across stimulus types and modalities (11/)
February 4, 2025 at 6:03 PM
In the “across” condition, performance improves for models with stronger cross-lingual semantic alignment (where translations cluster together in the embedding space) (9/)
February 4, 2025 at 6:03 PM
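One common way to operationalize this kind of alignment, sketched below under the assumption of row-aligned translation embeddings (the paper's exact metric isn't stated in the thread): translation pairs should be closer than arbitrary cross-language pairs.

```python
# Sketch of one cross-lingual alignment measure: how much closer are
# translation pairs than random cross-language pairs in embedding space?
import numpy as np

def alignment_score(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """emb_a, emb_b: (n_items, dim) embeddings of mutual translations, row-aligned."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sims = a @ b.T                   # all cross-language cosine similarities
    paired = np.mean(np.diag(sims))  # translation pairs
    baseline = np.mean(sims)         # arbitrary cross-language pairs
    return paired - baseline         # > 0 means translations cluster together
```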
But which model properties influence LM-to-brain alignment across languages?

In the “within” condition, encoding performance is highest for models with good next-word prediction abilities (8/)
February 4, 2025 at 6:03 PM
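Next-word prediction ability is standardly summarized as perplexity on held-out text. A sketch for causal LMs; the model name is a stand-in, and masked LMs would need a pseudo-likelihood variant:

```python
# Sketch: next-word prediction quality as perplexity on held-out text.
# "gpt2" is illustrative only; the paper evaluates multilingual LMs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"
tok = AutoTokenizer.from_pretrained(name)
lm = AutoModelForCausalLM.from_pretrained(name)

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean next-token cross-entropy
    return float(torch.exp(loss))
```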
We also replicated, in a cross-lingual setting, the finding that the best fit to brain responses is obtained in intermediate-to-deep layers (in each subplot pair, the left panel is “within”, the right one “across”) (7/)
February 4, 2025 at 6:03 PM
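A sketch of such a layer-wise comparison, assuming ridge regression encoding models; layer_embs (one (n_words, dim) matrix per layer) and bold (n_words, n_voxels) are hypothetical arrays:

```python
# Sketch: find which LM layer best predicts brain responses.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

def layer_scores(layer_embs, bold, cv=5):
    scores = []
    for X in layer_embs:
        pred = cross_val_predict(RidgeCV(alphas=np.logspace(-2, 4, 7)), X, bold, cv=cv)
        # mean voxel-wise Pearson r between predicted and observed responses
        rs = [np.corrcoef(bold[:, v], pred[:, v])[0, 1] for v in range(bold.shape[1])]
        scores.append(float(np.mean(rs)))
    return scores  # per the thread, the peak sits at intermediate-to-deep layers
```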
We evaluated 20 multilingual LMs with different architectures and training objectives. All of them predicted brain responses in the various languages (“within”) and, critically, generalized zero-shot to unseen languages (“across”) (6/)
February 4, 2025 at 6:03 PM
Critically, we fit two kinds of encoding models:
1️⃣ “within” encoding models, training and testing on data from a single language with cross-validation
2️⃣ “across” encoding models, training on N-1 languages and testing on the left-out language (sketched below) (5/)
February 4, 2025 at 6:03 PM
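A minimal sketch of the two schemes, assuming ridge regression (the thread doesn't name the estimator) and a hypothetical data[lang] = (X, Y) mapping from LM embeddings to brain responses:

```python
# Minimal sketch of the two encoding schemes, assuming ridge regression.
# data[lang] = (X, Y): LM embeddings (n_samples, dim) and brain
# responses (n_samples, n_voxels) for one language. Hypothetical names.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

def mean_voxel_r(y_true, y_pred):
    """Mean voxel-wise Pearson correlation between observed and predicted."""
    return float(np.mean([np.corrcoef(y_true[:, v], y_pred[:, v])[0, 1]
                          for v in range(y_true.shape[1])]))

def within(X, Y, cv=5):
    """1) Train and test within one language, with cross-validation."""
    return mean_voxel_r(Y, cross_val_predict(Ridge(alpha=1.0), X, Y, cv=cv))

def across(data, held_out):
    """2) Train on N-1 languages, test zero-shot on the left-out one."""
    X_tr = np.vstack([X for lang, (X, _) in data.items() if lang != held_out])
    Y_tr = np.vstack([Y for lang, (_, Y) in data.items() if lang != held_out])
    X_te, Y_te = data[held_out]
    model = Ridge(alpha=1.0).fit(X_tr, Y_tr)
    return mean_voxel_r(Y_te, model.predict(X_te))
```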
In Study I, we:
1️⃣ Present participants with auditory passages and record their brain responses in the language network
2️⃣ Extract contextualized word embeddings from multilingual LMs
3️⃣ Fit encoding models predicting brain activity from the embeddings (see the sketch below) (4/)
February 4, 2025 at 6:03 PM
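A sketch of the input to steps 2️⃣-3️⃣, extracting contextualized word embeddings with Hugging Face transformers; the model and layer here are illustrative, not the paper's exact choices:

```python
# Sketch: one contextualized embedding per word from a multilingual LM.
# "xlm-roberta-base" and layer 8 are stand-ins, not the paper's choices.
import torch
from transformers import AutoModel, AutoTokenizer

name = "xlm-roberta-base"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_hidden_states=True)

def word_embeddings(sentence: str, layer: int = 8) -> torch.Tensor:
    """Average subword states within each word -> (n_words, dim) tensor."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (n_subwords, dim)
    words = {}
    for i, wid in enumerate(enc.word_ids()):
        if wid is not None:  # skip special tokens
            words.setdefault(wid, []).append(hidden[i])
    return torch.stack([torch.stack(v).mean(0) for v in words.values()])
```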