T. Anderson Keller
@andykeller.bsky.social
Postdoctoral Fellow at Harvard Kempner Institute. Trying to bring natural structure to artificial neural representations. Prev: PhD at UvA. Intern @ Apple MLR, Work @ Intel Nervana
We found that wave-based models converged much more reliably than deep CNNs, and even outperformed U-Nets with similar numbers of parameters when pushed to their limits. We hypothesize that this is due to the parallel processing ability that wave dynamics confer, which the other CNNs lack.

11/14
March 10, 2025 at 3:34 PM
As a first step towards the answer, we used the Tetris-like dataset and variants of MNIST to compare the semantic segmentation ability of these wave-based models (seen below) with two relevant baselines: Deep CNNs w/ large (full-image) receptive fields, and small U-Nets.

10/14
March 10, 2025 at 3:34 PM
Was this just due to using Fourier transforms for semantic readouts, or to the wave-biased architecture? No! The same models with LSTM dynamics and a linear readout of the hidden-state time series still learned waves when trained to semantically segment images of Tetris-like blocks!

8/14
March 10, 2025 at 3:34 PM
Looking at the Fourier transform of the resulting neural oscillations at each point in the hidden state, we then saw that the model learned to produce different frequency spectra for each shape, meaning each neuron really was able to 'hear' which shape it was a part of!

7/14
March 10, 2025 at 3:34 PM
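The per-neuron Fourier readout described in the post above can be sketched as follows. Everything here is an illustrative stand-in, not the paper's data: we synthesize a hidden-state time series in which two regions ("shapes") oscillate at different frequencies, then take each neuron's magnitude spectrum over time.

```python
import numpy as np

# Hypothetical hidden-state time series from a wave-based RNN:
# shape (T, H, W) -- T timesteps of an H x W grid of neurons.
# Two synthetic "shapes" oscillate at different frequencies, standing in
# for the shape-specific dynamics described in the post.
T, H, W = 128, 8, 8
t = np.arange(T)
hidden = np.zeros((T, H, W))
hidden[:, :4, :] = np.sin(2 * np.pi * 5 * t / T)[:, None, None]   # "shape A": 5 cycles
hidden[:, 4:, :] = np.sin(2 * np.pi * 12 * t / T)[:, None, None]  # "shape B": 12 cycles

# Per-neuron magnitude spectrum over time -- the 'hearing' readout:
spectra = np.abs(np.fft.rfft(hidden, axis=0))   # (T // 2 + 1, H, W)
peak_bin = spectra[1:].argmax(axis=0) + 1       # dominant nonzero frequency per neuron
```

Each neuron's peak frequency bin then identifies which shape it belongs to, which is the sense in which a neuron can 'hear' its shape.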
We made wave dynamics flexible by adding learned damping and natural frequency encoders, allowing hidden state dynamics to adapt based on the input stimulus. On simple polygon images, we found the model learned to use these parameters to produce shape-specific wave dynamics:

6/14
March 10, 2025 at 3:34 PM
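A minimal sketch of the flexible dynamics in the post above: each hidden unit as a damped oscillator whose damping and natural frequency would, in the actual model, come from learned encoders conditioned on the input. Here those encoder outputs are replaced by fixed assumed arrays `gamma` and `omega`; the integrator and step sizes are illustrative choices.

```python
import numpy as np

def oscillator_rollout(omega, gamma, steps=200, dt=0.05):
    """Roll out u'' = -omega^2 * u - 2 * gamma * u' per unit (semi-implicit Euler)."""
    u = np.ones_like(omega)    # initial displacement
    v = np.zeros_like(omega)   # initial velocity
    traj = []
    for _ in range(steps):
        v += dt * (-(omega ** 2) * u - 2 * gamma * v)
        u += dt * v
        traj.append(u.copy())
    return np.stack(traj)      # (steps, n_units)

omega = np.array([2.0, 6.0])   # assumed per-unit natural frequencies
gamma = np.array([0.1, 0.1])   # assumed per-unit damping
traj = oscillator_rollout(omega, gamma)
```

Making `omega` and `gamma` input-dependent is what lets the dynamics, and hence the frequency spectra, become stimulus-specific.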
To test this, we needed a task; so we opted for semantic segmentation on large images, but crucially with neurons having very small one-step receptive fields. Thus, if we were able to decode global shape information from each neuron, it must be coming from recurrent dynamics.

5/14
March 10, 2025 at 3:34 PM
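The receptive-field logic in the post above can be made concrete with a toy recurrence: one step applies only a 3x3 local update, yet after t steps a neuron has been influenced by a (2t+1) x (2t+1) neighborhood, so any global information it carries must have arrived through the recurrent dynamics. The kernel and grid size here are illustrative, not the paper's architecture.

```python
import numpy as np

def local_step(x):
    """One recurrent update mixing each cell with its 3x3 neighborhood."""
    out = np.zeros_like(x)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += np.roll(np.roll(x, dy, axis=0), dx, axis=1) / 9.0
    return out

x = np.zeros((21, 21))
x[10, 10] = 1.0                     # a single active neuron
for _ in range(3):                  # three recurrent steps
    x = local_step(x)
width = np.count_nonzero(x[10])     # support width along the center row: 2*3 + 1 = 7
```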
We found that, in line with theory, we could reliably predict the area of the drum analytically by looking at the fundamental frequency of oscillation of each neuron in our hidden state. But is this too simple? How much further can we take it if we add learnable parameters?

4/14
March 10, 2025 at 3:34 PM
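The analytic readout in the post above follows from the membrane spectrum: an L x L drum with wave speed c has fundamental frequency f_11 = c / (sqrt(2) * L), so area = L^2 = c^2 / (2 * f_11^2). A sketch of recovering area from one neuron's oscillation, with c, L, the sample rate, and the trace itself all synthetic stand-ins:

```python
import numpy as np

c, L = 1.0, 2 * np.sqrt(2)                  # assumed wave speed and side length (area = 8)
f11 = c / (np.sqrt(2) * L)                  # fundamental frequency = 0.25

fs, T = 32.0, 1024                          # sample rate and trace length
t = np.arange(T) / fs
signal = np.sin(2 * np.pi * f11 * t)        # stand-in for one neuron's oscillation

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(T, d=1 / fs)
f_est = freqs[spectrum[1:].argmax() + 1]    # dominant nonzero frequency
area_est = c ** 2 / (2 * f_est ** 2)        # invert the drum relation -> ~8
```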
Inspired by Mark Kac’s famous question, "Can one hear the shape of a drum?" we thought: Maybe a neural network can use wave dynamics to integrate spatial information and effectively "hear" visual shapes... To test this, we tried feeding images of squares to a wave-based RNN:

3/14
March 10, 2025 at 3:34 PM
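A minimal sketch of a wave-based recurrence like the one in the post above: the hidden state evolves under a discretized 2D wave equation, with the input image (a square) seeding the wave field. The grid size, number of steps, and wave-speed constant `c2` are illustrative choices, not the paper's.

```python
import numpy as np

def laplacian(u):
    """5-point Laplacian with zero (Dirichlet) boundaries."""
    lap = -4 * u
    lap[1:, :]  += u[:-1, :]
    lap[:-1, :] += u[1:, :]
    lap[:, 1:]  += u[:, :-1]
    lap[:, :-1] += u[:, 1:]
    return lap

def wave_rollout(image, steps=64, c2=0.2):
    """u_{t+1} = 2 u_t - u_{t-1} + c2 * laplacian(u_t), seeded by the image."""
    u_prev = np.zeros_like(image, dtype=float)
    u = image.astype(float).copy()
    states = [u.copy()]
    for _ in range(steps):
        u_next = 2 * u - u_prev + c2 * laplacian(u)
        u_prev, u = u, u_next
        states.append(u.copy())
    return np.stack(states)        # (steps + 1, H, W)

img = np.zeros((16, 16))
img[5:11, 5:11] = 1.0              # a 6x6 'square' stimulus
states = wave_rollout(img)
```

Waves radiating from the square's boundary reach every neuron, so even a neuron far from the square ends up oscillating -- the mechanism by which it could 'hear' the shape.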
In the physical world, almost all information is transmitted through traveling waves -- why should it be any different in your neural network?

Super excited to share recent work with the brilliant @mozesjacobs.bsky.social: "Traveling Waves Integrate Spatial Information Through Time"

1/14
March 10, 2025 at 3:34 PM