Raj Magesh
@raj-magesh.org
Cognitive computational neuroscience | Postdoc at Pitt advised by Marlene Behrmann; PhD at JHU advised by Mick Bonner

https://raj-magesh.org
Yeah, definitely!

A relevant paper along these lines is www.nature.com/articles/nat..., where they show dimensionality collapse on error trials in monkey PFC representations!
The importance of mixed selectivity in complex cognitive tasks - Nature
December 18, 2025 at 5:48 PM
Sorry, I'd missed this sub-thread!

Yes, several prior reports of low-D representations arose from deliberate constraints imposed to measure behavioral relevance. Here, we only consider cross-trial/cross-subject reliability, not task-related constraints (a very interesting Q in its own right).
December 18, 2025 at 5:47 PM
Also: the ease of reaching out to the devs who *actually wrote* the software and getting timely responses from them.

And how easy it is to contribute bugfixes.

Even if the frequency of bugs is higher, the total annoyance is much lower, perhaps because I feel like I have agency.

Long live FOSS!
December 17, 2025 at 10:43 PM
I think more the latter than the former.

But my point is simpler: I think neuroscience experiments often yield low-D manifolds because of simplicity in inputs (e.g. carefully controlled stimuli) and easy tasks. I expect naturalistic stimuli and behaviors would elicit more high-D representations.
December 15, 2025 at 11:05 PM
I agree that relative measures are cleaner to measure and easier to interpret!

Our point in this paper is mainly that the absolute dimensionality is much higher than previously thought throughout visual cortex! And so we might need different approaches to understand these high-D data.
December 12, 2025 at 9:51 PM
Yeah, given all the limitations, it's amazing how there's still so much stimulus-related information in BOLD signals!

In Fig S12 (journals.plos.org/ploscompbiol...) we find power-law spectra in a monkey electrophysiology dataset too.

And the same in mouse Ca-imaging: www.nature.com/articles/s41...
December 12, 2025 at 5:02 PM
But also, we're binning the eigenspectrum heavily to measure this small-but-nonzero signal in the tail!

This is a tradeoff: we lose spectral resolution but at least we can measure the signal there.
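
Roughly what I mean by binning (a sketch with made-up numbers, not the exact procedure from the paper): average the spectrum within log-spaced rank bins, so each tail estimate pools over many dimensions.

```python
import numpy as np

def log_bin_spectrum(eigvals, n_bins=20):
    """Average an eigenspectrum within logarithmically spaced rank bins.

    Pooling many tail dimensions into each bin trades spectral resolution
    for a less noisy estimate of the small variance out in the tail.
    """
    ranks = np.arange(1, len(eigvals) + 1)
    edges = np.geomspace(1, len(eigvals), n_bins + 1)
    bin_ids = np.digitize(ranks, edges[1:-1])  # assign each rank to a bin
    centers = np.array([ranks[bin_ids == b].mean() for b in range(n_bins)])
    means = np.array([eigvals[bin_ids == b].mean() for b in range(n_bins)])
    return centers, means

# Example: a noisy 1/k power-law spectrum
rng = np.random.default_rng(0)
spectrum = np.arange(1, 10_001) ** -1.0 + rng.normal(0, 1e-4, size=10_000)
centers, means = log_bin_spectrum(spectrum)
```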
December 12, 2025 at 4:54 PM
The nice thing about the estimator we're using in the paper is that if there is no stimulus-related signal (i.e., nothing that generalizes across repeated presentations and new stimuli), the expected value of the variance is 0.

So what we're seeing significantly above zero is not noise.
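
A toy simulation of that logic (not the actual estimator from the paper, which also tests generalization to held-out stimuli): if two repeats are pure independent noise, the cross-repeat covariance averages to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_voxels, n_sims = 200, 50, 1000

estimates = []
for _ in range(n_sims):
    # Two "repeats" of pure noise: no stimulus-related signal at all
    repeat_1 = rng.normal(size=(n_stimuli, n_voxels))
    repeat_2 = rng.normal(size=(n_stimuli, n_voxels))
    # Only variance that is shared across repeats survives this product
    cross_cov = repeat_1.T @ repeat_2 / (n_stimuli - 1)
    estimates.append(np.trace(cross_cov))

print(np.mean(estimates))  # ~0: no positive bias from trial-by-trial noise
```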
December 12, 2025 at 4:53 PM
Ohhh I see what you meant! I've been using "high" and "low" variance to refer to the first few dimensions and the tail of the eigenspectrum respectively.

Yeah, in principle, noise should definitely inflate the tail of the eigenspectrum (also the rest, but less noticeably).
December 12, 2025 at 4:51 PM
Thanks!

The cross-decomposition method we're using measures variance that generalizes (i) across multiple presentations of the stimuli and (ii) to a held-out test set, so I'm not too worried about that---we are measuring only stimulus-related signal.

(I think you meant low variance?)
December 12, 2025 at 4:40 PM
Yep, I think many tasks commonly used in neuroscience don't require attention to many features, but actual naturalistic behavior is probably way more high-dimensional.

www.pnas.org/doi/full/10....
December 12, 2025 at 4:38 PM
I'll refactor it into a standalone tool at some point when I get the time. 🙃

But the sklearn implementation is likely sufficient for most purposes.
December 12, 2025 at 4:20 PM
I think the best place to start would be this implementation of cross-decomposition in sklearn: scikit-learn.org/stable/modul...

I've written a GPU-accelerated version that does other stuff too (permutation tests, etc.) but it's unfortunately not quite plug-and-play (github.com/BonnerLab/sc...).
PLSSVD
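
If it helps, here's a minimal example of calling sklearn's PLSSVD on random data (the shapes are just placeholders): it finds paired directions in X and Y with maximal cross-covariance.

```python
import numpy as np
from sklearn.cross_decomposition import PLSSVD

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))  # e.g. responses from repeat/subject 1
Y = rng.normal(size=(500, 120))  # e.g. responses from repeat/subject 2

# SVD of the cross-covariance between X and Y: each component pairs a
# direction in X with a direction in Y that covary maximally
pls = PLSSVD(n_components=10).fit(X, Y)
X_scores, Y_scores = pls.transform(X, Y)
print(X_scores.shape, Y_scores.shape)  # (500, 10), (500, 10)
```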
December 12, 2025 at 4:19 PM
Yep, at some point in the process, the relevant info must be extracted for task purposes, and a low-D manifold is what I'd expect to see there. Though it seems that throughout visual cortex, at least, the code remains pretty high-dimensional (how much of it ends up being used for any given task is unclear).
December 12, 2025 at 2:21 PM
Also, while I think many would agree visual representations are high-dimensional, often our datasets and tools have been too limited to detect it.

Estimates of visual cortex dimensionality have traditionally been much lower (~10s-100), not the unbounded power law we're reporting here.
December 12, 2025 at 2:18 PM
I tend to think of these representations as being a rich, general-purpose feature bank that can be easily read out from for a variety of tasks. But yeah, I'm sure different latent subspaces are differentially activated based on task demands.
December 12, 2025 at 2:15 PM
Yeah, that's an important point! Our analysis here only measures reliability of the representation across trials/held-out stimuli, not whether the info is used for downstream processing.

I'm also curious how dimensionality depends on task demands, but that's hard to answer with this dataset.
December 12, 2025 at 2:13 PM
But also, networks do have pretty high-dimensional representations in general, often with power-law statistics too!

A nice example is in proceedings.neurips.cc/paper_files/...
$\alpha$-ReQ : Assessing Representation Quality in Self-Supervised Learning by measuring eigenspectrum decay
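
(For anyone curious, the α there is basically the negative slope of the eigenspectrum on log-log axes; a quick-and-dirty estimate, glossing over the binning/fitting details a real analysis would need:)

```python
import numpy as np

def powerlaw_exponent(eigvals, fit_range=slice(10, 1000)):
    """Estimate alpha assuming eigvals[k] ~ k**(-alpha) over fit_range."""
    ranks = np.arange(1, len(eigvals) + 1)
    log_rank = np.log(ranks[fit_range])
    log_val = np.log(eigvals[fit_range])
    slope, _ = np.polyfit(log_rank, log_val, deg=1)
    return -slope  # alpha is the negative log-log slope

# Synthetic check: a spectrum decaying as 1/k should give alpha ~= 1
spectrum = np.arange(1, 5001, dtype=float) ** -1.0
print(powerlaw_exponent(spectrum))  # ~1.0
```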
December 12, 2025 at 2:07 PM
Yeah, compression of info often happens close to the final layers of DNNs, likely because networks are trained on much more limited tasks than an open-ended system like our brains.

e.g. networks trained on CIFAR-10 often end up lower-dimensional than those trained on CIFAR-100
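
(One simple way to quantify that kind of difference is the participation ratio of the feature covariance eigenspectrum; it's only one of several dimensionality measures, but it's easy to compute:)

```python
import numpy as np

def participation_ratio(features):
    """Effective dimensionality of a (samples x units) feature matrix:
    (sum of eigenvalues)^2 / sum of squared eigenvalues of the covariance.
    """
    eigvals = np.linalg.eigvalsh(np.cov(features, rowvar=False))
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative values
    return eigvals.sum() ** 2 / (eigvals ** 2).sum()

# e.g. compare penultimate-layer activations from a CIFAR-10-trained vs. a
# CIFAR-100-trained network by passing each network's feature matrix here
```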
December 12, 2025 at 2:05 PM
Yeah, there are definitely analogous findings in DNNs!

I particularly like Figure 7 in arxiv.org/abs/2204.06125 as an example of high-dimensional representations being useful in DNNs.
Hierarchical Text-Conditional Image Generation with CLIP Latents
December 12, 2025 at 5:02 AM
(Apologies, I'm not sure if I'm threading correctly; I'm splitting up a single response into multiple comments due to the incredibly low character limit (!?!))
December 12, 2025 at 4:58 AM
There was some variation in the size of V1 across participants, but that shouldn't affect our results beyond leaving less data to estimate dimensionality when there are fewer voxels.

I'm not quite sure what you meant about V4; could you elaborate or point me to a paper?
December 12, 2025 at 4:55 AM