Lightnews — Scholar-powered news

Christina Sartzetaki

@sargechris.bsky.social

PhD candidate @ UvA 🇳🇱, ELLIS 🇪🇺 | {video, neuro, cognitive}-AI
Neural networks 🤖 and brains 🧠 watching videos

🔗 https://sites.google.com/view/csartzetaki/

9/ This is our first research output in this interesting new direction, and I’m actively working on it, so stay tuned for updates and follow-up work! Feel free to discuss your ideas and opinions with me ⬇️

December 11, 2024 at 4:13 PM

8/ 🎯 With this work we aim to forge a path that widens our understanding of temporal and semantic video representations in brains and machines, ideally leading towards more efficient video models and more mechanistic explanations of processing in the human brain.

December 11, 2024 at 4:13 PM

7/ We report a significant negative correlation between model FLOPs and alignment in several high-level brain areas, indicating that computationally efficient neural networks can potentially produce more human-like semantic representations.

December 11, 2024 at 4:13 PM
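
A correlation of this kind can be checked with a rank correlation across models plus a permutation test. The sketch below is only an illustration under stated assumptions: the thread does not say which correlation measure or significance test was used, the function names are hypothetical, and the FLOPs and alignment numbers are made-up placeholders, not results from the study.

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation (ordinal ranks; assumes no tied values)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])

def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided p-value: how often shuffled data correlates as strongly as observed."""
    rng = np.random.default_rng(seed)
    observed = abs(spearman(x, y))
    null = np.array([abs(spearman(x, rng.permutation(y)))
                     for _ in range(n_permutations)])
    return float((np.sum(null >= observed) + 1) / (n_permutations + 1))

# Placeholder values for illustration only (NOT from the study):
# one entry per model, compute cost vs. alignment to one brain region.
gflops = np.array([4.0, 12.0, 33.0, 65.0, 110.0, 180.0, 280.0, 400.0])
alignment = np.array([0.31, 0.29, 0.30, 0.26, 0.24, 0.25, 0.21, 0.19])

rho = spearman(gflops, alignment)
p_value = permutation_p_value(gflops, alignment)
print(f"rho={rho:.3f}, p={p_value:.4f}")
```

A negative `rho` with a small `p_value` would correspond to the pattern described in the post: models with lower compute cost tending to align better.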

6/ Training dataset biases related to a certain functional selectivity (e.g. face features) can carry over into brain alignment with the respective functionally selective brain area (e.g. face region FFA).

December 11, 2024 at 4:13 PM

5/ Comparing model architectures, CNNs exhibit a better hierarchy overall (with a clear mid-depth peak for early regions and gradual improvement as depth increases for late regions). Transformers, however, achieve an impressive correlation to early regions as early as one tenth of the way through their layers.

December 11, 2024 at 4:13 PM

4/ We decouple temporal modeling from action space optimization by adding image action recognition models as a control. Our results show that temporal modeling is key for alignment to early visual brain regions, while a relevant classification task is key for alignment to higher-level regions.

December 11, 2024 at 4:13 PM

3/ We disentangle 4 factors of variation (temporal modeling, classification task, architecture, and training dataset) that affect model-brain alignment, which we measure by conducting Representational Similarity Analysis (RSA) across multiple brain regions and model layers.

December 11, 2024 at 4:13 PM
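
For readers unfamiliar with RSA, here is a minimal sketch of the general technique for one model layer and one brain region. It is not the authors' pipeline: the distance measure (1 minus Pearson correlation), the Spearman comparison, the function names, and the random stand-in data are all assumptions chosen for illustration.

```python
import numpy as np

def rdm(responses):
    """Representational dissimilarity matrix.

    responses: array of shape (n_stimuli, n_features).
    Entry (i, j) is 1 - Pearson correlation between stimuli i and j.
    """
    return 1.0 - np.corrcoef(responses)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def ordinal_ranks(values):
    """Ranks without tie averaging (adequate for continuous-valued data)."""
    return np.argsort(np.argsort(values)).astype(float)

def rsa_score(model_features, brain_responses):
    """Spearman correlation between the two RDMs' upper triangles."""
    a = ordinal_ranks(upper_triangle(rdm(model_features)))
    b = ordinal_ranks(upper_triangle(rdm(brain_responses)))
    return float(np.corrcoef(a, b)[0, 1])

# Random stand-ins (hypothetical): 20 videos, one layer, one brain region.
rng = np.random.default_rng(0)
n_videos = 20
layer_features = rng.normal(size=(n_videos, 64))  # layer activations per video
roi_voxels = rng.normal(size=(n_videos, 30))      # voxel responses per video

self_score = rsa_score(layer_features, layer_features)  # identical inputs
cross_score = rsa_score(layer_features, roi_voxels)     # unrelated inputs
print(self_score, cross_score)
```

Repeating `rsa_score` over every (layer, region) pair yields the kind of layer-by-region alignment grid the post describes. The two representations only need to share the stimulus axis, so the feature and voxel counts can differ.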

2/ We take a step in this direction by benchmarking models at scale on their representational alignment to the recently released BOLD Moments Dataset of fMRI recordings from humans watching videos.

December 11, 2024 at 4:13 PM

1/ Humans are very efficient at processing continuous visual input, but neural networks trained to process videos are still not up to that standard. What can we learn from comparing the internal representations of the two systems (biological and artificial)?

December 11, 2024 at 4:13 PM
