Sushrut Thorat
@sushrutthorat.bsky.social
Recurrent computations and lifelong learning.
Postdoc at IKW-UOS@DE with @timkietzmann.bsky.social
Prev. Donders@NL, CIMeC@IT, IIT-B@IN
Neon color spreading: hard to be sure, although there's more positive signal in the center, BUT the class prediction stays "wire"...
November 18, 2025 at 9:48 PM
I did a quick check with the BLT_VS trained on Ecoset (github.com/KietzmannLab...).

Visualizing the feedback at the second "LGN" layer and printing the predicted class: the feedback doesn't seem to show the illusory contour, but the class, interestingly, changes from guitar to lamp to axe??
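A minimal PyTorch sketch of this kind of probe (the loader and the layer name below are hypothetical placeholders, not the actual BLT_VS API):

```python
import torch

# Hypothetical loader; the real BLT_VS checkpoint loading will differ.
model = load_blt_vs_ecoset()
model.eval()

feedback_acts = []  # probed layer's activation, one entry per recurrent timestep

def hook(module, inputs, output):
    feedback_acts.append(output.detach().cpu())

# "lgn_2" is a placeholder; inspect model.named_modules() for the real layer name.
handle = dict(model.named_modules())["lgn_2"].register_forward_hook(hook)

image = torch.randn(1, 3, 224, 224)  # stand-in for the illusion image

with torch.no_grad():
    logits = model(image)  # assuming a (timesteps, n_classes) readout

for t, step_logits in enumerate(logits):
    print(f"t={t}: predicted class index {step_logits.argmax().item()}")

handle.remove()
```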
November 18, 2025 at 9:42 PM
esp. because the "confidence increase due to feedback" picture painted here - bsky.app/profile/tahe... - is eerily similar to what we expected (and found) in the BLTs in arxiv.org/abs/2111.07898

Very curious!
November 18, 2025 at 8:26 PM
GPN alignment beyond VVC (NSD streams ROIs): better than input embeddings/GSNs in all ROIs; better than SOTA in parietal/midparietal; on par with SOTA in midventral/midlateral/lateral; worse only in early visual cortex (expected due to low-level features). 11/14
November 18, 2025 at 12:37 PM
GPN-R-SimCLR isn't just a SOTA model of ventral scene representations; it also largely subsumes the variance explained by all the other models (variance partitioning; green-edged squares: unique variance)! Universality? Language-based codes <= co-occurrence of visual scene parts? 10/14
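A rough sketch of the variance-partitioning logic for two predictors, where unique variance = full-model R² minus the reduced-model R² (random placeholder data and names; the paper partitions over more models):

```python
import numpy as np

def r2(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - (y - X @ beta).var() / y.var()

n = 1000  # number of (vectorized) RDM entries; random stand-ins below
gpn_rdm, other_rdm, brain_rdm = (np.random.randn(n) for _ in range(3))

full         = r2(np.column_stack([gpn_rdm, other_rdm]), brain_rdm)
unique_gpn   = full - r2(other_rdm[:, None], brain_rdm)  # variance only GPN explains
unique_other = full - r2(gpn_rdm[:, None], brain_rdm)    # variance only the other model explains
shared       = full - unique_gpn - unique_other
print(unique_gpn, unique_other, shared)
```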
November 18, 2025 at 12:37 PM
Equating the architecture and dataset, but switching objective from glimpse prediction to caption embedding (MPNet; sGSN) or multi-class object prediction (cGSN), reduces the alignment. Furthermore, no related/SOTA model (36 tested; Table S5) outperforms GPN-R-SimCLR => a new SOTA model! 9/14
November 18, 2025 at 12:37 PM
We assess RDM alignment with the 'ventral' visual cortex (VVC) RDMs, across GPN variants (and glimpse embedding backbones). GPN representations align better than the input glimpse embeddings (dotted black lines) => GPN contextualization & integration creates VVC-aligned scene repr.! 8/14
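A minimal sketch of the RSA step, assuming correlation-distance RDMs and Spearman alignment (random stand-ins below, not NSD data):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features):
    """Condensed RDM: pairwise correlation distances across items (rows)."""
    return pdist(features, metric="correlation")

model_feats = np.random.randn(100, 512)  # e.g. GPN hidden states, one row per image
input_feats = np.random.randn(100, 512)  # e.g. the raw glimpse-embedding baseline
brain_rdm   = pdist(np.random.randn(100, 300), metric="correlation")  # stand-in VVC RDM

# RSA alignment = Spearman correlation between the model RDM and the brain RDM.
print("GPN   :", spearmanr(rdm(model_feats), brain_rdm).correlation)
print("input :", spearmanr(rdm(input_feats), brain_rdm).correlation)
```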
November 18, 2025 at 12:37 PM
Glimpse embeddings are contextualized (wrt their relations) and integrated, resulting in a "scene representation". Do GPN repr. align with natural scene repr. in human visual cortex? We turn to the Natural Scenes Dataset (NSD) and Representational Similarity Analysis (RSA). 7/14
November 18, 2025 at 12:37 PM
GPN predictions align with embedding of the next-glimpse (given saccade) > other glimpses from the same scene > glimpses from other scenes => co-occurrence (+ spatial arrangement, with S) learning. With R, prediction loss decreases over glimpses => integration. 6/14
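A sketch of how that ordering can be checked with cosine similarities (random stand-in embeddings; names are illustrative):

```python
import torch
import torch.nn.functional as F

def mean_cosine(pred, embeddings):
    """Mean cosine similarity between a predicted embedding (dim,) and a set (n, dim)."""
    return F.cosine_similarity(pred.unsqueeze(0), embeddings, dim=-1).mean().item()

d = 512
pred         = torch.randn(d)      # GPN's predicted next-glimpse embedding
next_glimpse = torch.randn(1, d)   # true next-glimpse embedding
same_scene   = torch.randn(8, d)   # other glimpses from the same scene
other_scenes = torch.randn(64, d)  # glimpses from different scenes

print("next glimpse:", mean_cosine(pred, next_glimpse))
print("same scene  :", mean_cosine(pred, same_scene))
print("other scenes:", mean_cosine(pred, other_scenes))
# Reported ordering: next glimpse > same scene > other scenes.
```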
November 18, 2025 at 12:37 PM
Glimpse Prediction Networks (GPNs) take a high-level glimpse embedding and, optionally, the planned saccade S as inputs, and predict the high-level next-glimpse embedding, optionally using recurrence (R) to carry state across glimpses. Glimpses = COCO scene crops around DeepGaze3 fixations. 5/14
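A minimal sketch of this setup, assuming a GRU-based recurrent core (sizes, layer choices, and names are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class GPN(nn.Module):
    """Sketch of a recurrent Glimpse Prediction Network with saccade input."""
    def __init__(self, emb_dim=512, saccade_dim=2, hidden_dim=512):
        super().__init__()
        self.rnn = nn.GRUCell(emb_dim + saccade_dim, hidden_dim)  # R: state across glimpses
        self.readout = nn.Linear(hidden_dim, emb_dim)             # predicts next-glimpse embedding

    def forward(self, glimpse_embs, saccades):
        # glimpse_embs: (T, B, emb_dim) high-level glimpse embeddings
        # saccades:     (T, B, saccade_dim) planned saccade vectors (S)
        h = torch.zeros(glimpse_embs.size(1), self.rnn.hidden_size)
        preds = []
        for emb, sac in zip(glimpse_embs, saccades):
            h = self.rnn(torch.cat([emb, sac], dim=-1), h)
            preds.append(self.readout(h))   # prediction of the next glimpse's embedding
        return torch.stack(preds)           # (T, B, emb_dim); target is glimpse_embs shifted by one
```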
November 18, 2025 at 12:37 PM
A hard ARC problem from Fig. 1 of www.nature.com/articles/s41...

Am I the only one who thinks that, in the test solution, the “overtaken” dots could be red?
October 29, 2025 at 8:49 AM
Thanks for engaging :)
In Geirhos's cue-conflict images, the texture doesn't have to be only high-freq. The gram matrices are aligned across all layers - in later layers the RF sizes are huge, so the correlations needn't reflect only small-scale variation, as seen in my post.
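For reference, the gram matrix of a conv feature map is just its spatially pooled channel-by-channel correlation structure; a minimal sketch:

```python
import torch

def gram_matrix(feats):
    """Gram matrix of conv features (B, C, H, W): channel-by-channel inner
    products pooled over space. In deep layers each spatial unit has a large
    receptive field, so these statistics can capture large-scale structure,
    not just high-frequency texture."""
    b, c, h, w = feats.shape
    f = feats.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)  # (B, C, C)
```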
October 7, 2025 at 3:40 PM
hmm,
1. The way they quantify "texture" is based solely on high-freq components. But there are also low-freq components that don't carry meaningful information about shape either and could influence classification (suppl. fig. from an upcoming rev. of arxiv.org/abs/2507.03168)
October 7, 2025 at 11:55 AM
I was wondering whether the AlexNet in Geirhos's and our checks is the same as yours. Indeed, both the channel counts and the RF sizes have been changed. Larger RFs in particular might well help with shape bias: when the original RFs are used, the shape bias seems to drop to ~0.4. 7/
July 10, 2025 at 7:51 AM
Won't this inflate the reported shape and texture accuracies, and change the shape bias, compared to, say, what Geirhos reports (e.g. in proceedings.neurips.cc/paper_files/...)?

AlexNet seems to have a shape bias of barely ~0.3, whereas your Fig. 2 suggests a shape bias of 0.5! 6/
July 10, 2025 at 7:51 AM
So why are our results different?

I looked into the way shape bias was computed in your paper. I have a few questions:

"We selected the class with the highest probability from this subset and mapped it to one of the corresponding 16 categories." -> so the accuracy was not computed 1000-way? 3/
July 10, 2025 at 7:51 AM
As you might've seen, we too recently found that developmental considerations massively help with shape bias. However, more than visual acuity or color, contrast sensitivity was found to be key - bsky.app/profile/sush... In fact, color+blur doesn't get us above 0.5! 2/
July 10, 2025 at 7:51 AM
If this early experience is so critical for humans to acquire an entire, useful, ability (or bias), might it be useful for computer vision systems? It just so happens that current neural networks lack this bias—when shown cue-conflict images, their inference is texture-based. Can we help? 2/7
July 8, 2025 at 3:09 PM
I don't think one would think much of the blurry, underdeveloped vision that babies have. But apparently, if you miss a few months of vision after birth, you acquire configural processing deficits (you cannot readily distinguish faces based on relative positioning of the nose, lips, and eyes)! 1/7
July 8, 2025 at 3:09 PM
Woah this is insane! @tessamdekker.bsky.social this might be of interest!
June 29, 2025 at 3:33 PM
Was Neel’s response to these two tweets.
June 4, 2025 at 7:34 PM
100% on Jupyter notebooks, esp. ones which just work with Google Colab! I'm not yet sold on notebook-pubs or even on the Anthropic web releases. It's so much fun reading a hard copy detailing the core messages of the work, and then going to play with the data/model if interested.
June 4, 2025 at 4:36 PM
This is a great example for the utility of preprints.
"and it wasn’t just peer reviewed, it was peer tested." - RELEASE YOUR DATA & CODE!!!
June 4, 2025 at 4:32 PM
... (Claude/Gemini do the same)
January 16, 2025 at 10:49 AM
This is an interesting paper from CCN this year. Curious to see it fleshed out - 2024.ccneuro.org/pdf/595_Pape...

"Euclidean coordinates are the wrong prior for models of primate vision"
December 2, 2024 at 8:06 PM