Hannah Small
@hsmall.bsky.social
5th year PhD student in Cognitive Science at Johns Hopkins, working with Leyla Isik
https://www.hannah-small.com/
These findings highlight the importance of visual-semantic signals, above and beyond spoken language content, across cortex, even in the language network.
The code to replicate the analyses and figures is available here: github.com/Isik-lab/ubi...
8/8
GitHub - Isik-lab/ubiquitous-vis: Code for paper 'Ubiquitous cortical sensitivity to visual information during naturalistic, audiovisual movie viewing'
September 24, 2025 at 7:52 PM
Follow-up analyses showed that both social perception and language regions were best predicted by later vision model layers, which map onto high-level social-semantic signals (valence, the presence of a social interaction, and faces).
7/n
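The layer-preference analysis described above can be sketched as follows. This is an illustrative toy version with synthetic stand-in data, not the paper's pipeline: fit an encoding model per DNN layer and ask which layer best predicts an ROI's time course.

```python
# Illustrative layer-preference analysis on synthetic stand-in data.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trs = 240
# fake embeddings for four model layers, aligned to fMRI TRs
layer_embeddings = {f"layer_{i}": rng.standard_normal((n_trs, 32))
                    for i in range(1, 5)}
# simulate an ROI whose response is driven by the last ("later") layer
roi_response = (layer_embeddings["layer_4"] @ rng.standard_normal(32)
                + rng.standard_normal(n_trs))

layer_scores = {}
for name, X in layer_embeddings.items():
    # cross-validated R^2 of a ridge encoding model for this layer
    layer_scores[name] = cross_val_score(
        RidgeCV(alphas=np.logspace(-1, 3, 5)), X, roi_response,
        cv=5, scoring="r2").mean()

best_layer = max(layer_scores, key=layer_scores.get)
print(best_layer)  # the simulated ROI prefers the layer that drives it
```

Comparing cross-validated scores across layers, rather than in-sample fits, is what lets the preferred layer be read off without overfitting to the movie time course.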
September 24, 2025 at 7:51 PM
Importantly, vision and language embeddings are only weakly correlated throughout the movie, suggesting that each feature set predicts distinct variance in the neural responses.
6/n
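A minimal sketch of this kind of overlap check, assuming two embedding time series aligned to the same TRs (shapes and arrays are illustrative stand-ins, not the real movie features): correlate every vision dimension with every language dimension across time.

```python
# Cross-correlate stand-in vision and language embedding time series.
import numpy as np

rng = np.random.default_rng(0)
n_trs = 300
vision = rng.standard_normal((n_trs, 64))     # stand-in vision embeddings per TR
language = rng.standard_normal((n_trs, 32))   # stand-in language embeddings per TR

# z-score each dimension, then take the pairwise cross-correlation matrix
vis_z = (vision - vision.mean(0)) / vision.std(0)
lang_z = (language - language.mean(0)) / language.std(0)
cross_corr = vis_z.T @ lang_z / n_trs         # shape (64, 32)

# weak correlations imply each feature space can explain distinct variance
print(round(float(np.abs(cross_corr).mean()), 2))
```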
September 24, 2025 at 7:51 PM
We find that vision embeddings dominate prediction across cortex. Surprisingly, even language-selective regions were well predicted by vision model embeddings, as well as or better than by language model features.
5/n
September 24, 2025 at 7:51 PM
We densely labeled the vision and language features of the movie using a combination of human annotations and vision and language deep neural network (DNN) models, then linearly mapped these features to fMRI responses using an encoding model.
4/n
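A voxelwise encoding model in this spirit can be sketched as below: ridge-regress stimulus features onto fMRI responses and score held-out prediction per voxel. All arrays are synthetic stand-ins; the real pipeline (feature extraction, HRF handling, voxel selection) is not reproduced here.

```python
# Toy voxelwise encoding model: ridge regression with cross-validated scoring.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_trs, n_feats, n_voxels = 300, 50, 100
X = rng.standard_normal((n_trs, n_feats))       # stand-in feature time series
W = rng.standard_normal((n_feats, n_voxels))    # hidden feature-to-voxel weights
Y = X @ W + 0.5 * rng.standard_normal((n_trs, n_voxels))  # simulated responses

scores = np.zeros(n_voxels)
for train, test in KFold(n_splits=5).split(X):
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X[train], Y[train])
    pred = model.predict(X[test])
    # per-voxel correlation between predicted and observed held-out responses
    for v in range(n_voxels):
        scores[v] += np.corrcoef(pred[:, v], Y[test, v])[0, 1] / 5

print(round(float(scores.mean()), 2))
```

Scoring on held-out TRs (rather than the fitted ones) is what makes the per-voxel prediction accuracies comparable across feature spaces.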
September 24, 2025 at 7:49 PM
To address this, we collected fMRI data from 34 participants while they watched a 45-minute naturalistic audiovisual movie. Critically, we used functional localizer experiments to identify social interaction perception and language-selective regions in the same participants.
3/n
September 24, 2025 at 7:47 PM
Humans effortlessly extract social information from both the vision and language signals around us. However, most work (even most naturalistic fMRI encoding work) is limited to studying unimodal processing. How does the brain process simultaneous multimodal social signals?
2/n
September 24, 2025 at 7:46 PM