Adeel Razi
@adeelrazi.bsky.social
Computational Neuroscientist, NeuroAI, Causality. Monash, UCL, CIFAR. Lab: https://comp-neuro.github.io/
Congratulations and looking forward to seeing what you do there!
September 24, 2025 at 6:48 AM
This is for our collaboration with @wellcomeleap.bsky.social on Untangling Addiction.
wellcomeleap.org/ua/program/
Untangling Addiction Program Details | Wellcome Leap (wellcomeleap.org)
July 30, 2025 at 10:46 PM
That's really interesting and relevant, will read it closely and cite it in the related work. We currently cite this one for binary NNs: arxiv.org/abs/2002.10778
Training Binary Neural Networks using the Bayesian Learning Rule (arxiv.org)
May 27, 2025 at 10:40 AM
Re batch norm: it's effective in many settings, but can be brittle in others, like when used with small batch sizes, non-i.i.d. data, or models with stochasticity in the forward pass. In these cases, the running estimates of mean/variance can drift or misalign with test-time behaviour.
2/2
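A minimal PyTorch sketch of the mismatch, using dropout as a stand-in for stochasticity in the forward pass (illustrative only, not from the paper):

```python
# Illustrative only: BatchNorm's running stats are fit to the *stochastic* training-time
# activations, so they misalign with the deterministic eval-time distribution.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4096, 8)

drop = nn.Dropout(p=0.5)          # stand-in for a stochastic forward pass
bn = nn.BatchNorm1d(8)

drop.train(); bn.train()
for i in range(0, 4096, 32):
    _ = bn(drop(x[i:i + 32]))     # running mean/var estimated on noisy activations (var ~2)

drop.eval(); bn.eval()
out = bn(x)                       # at test time the noise is gone (var ~1) ...
print(out.var(dim=0))             # ... so normalised outputs have variance ~0.5, not ~1
```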
May 27, 2025 at 7:49 AM
Yes, absolutely, "noisy" was shorthand & it does depend on the surrogate. What I meant is that common surrogates can have high gradient variance, especially when their outputs saturate. That variance can hurt learning, particularly in deeper networks or those with binary/stochastic activations.
1/2
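A toy illustration of the variance point, using a fast-sigmoid surrogate with a made-up steepness (just a sketch, not any particular paper's setup):

```python
# Toy example: Heaviside forward pass, fast-sigmoid surrogate on the backward pass.
# When most units sit far from threshold (saturated), only a few samples carry gradient,
# so minibatch gradient estimates fluctuate from batch to batch.
import torch

class FastSigmoidSpike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, u):
        ctx.save_for_backward(u)
        return (u > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (u,) = ctx.saved_tensors
        return grad_out / (1.0 + 10.0 * u.abs()) ** 2   # surrogate slope (steepness 10, arbitrary)

torch.manual_seed(0)
w = torch.tensor(2.0, requires_grad=True)
grads = []
for _ in range(200):                                    # 200 independent minibatches
    x = torch.randn(256)
    loss = FastSigmoidSpike.apply(w * x - 1.0).mean()   # most units end up far from threshold
    w.grad = None
    loss.backward()
    grads.append(w.grad.item())

g = torch.tensor(grads)
print("gradient mean:", g.mean().item(), " std:", g.std().item())
```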
May 27, 2025 at 7:47 AM
of course, whenever you could!
May 26, 2025 at 7:43 AM
Paper: arxiv.org/abs/2505.17962
We’d love feedback, extensions, or critiques.
@neuralreckoning.bsky.social @fzenke.bsky.social @wellingmax.bsky.social
#NeuroAI
6/6
A Principled Bayesian Framework for Training Binary and Spiking Neural Networks (arxiv.org)
May 26, 2025 at 4:04 AM
Why does KL divergence show up everywhere in machine learning?
Because it's not a distance at all, it's the cost of believing your own model too much.
Minimizing KL = reducing surprise = optimizing variational free energy.
A silent principle behind robust inference.
5/6
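In symbols, the standard identity (generic notation, not the paper's):

```latex
% Variational free energy F[q] upper-bounds surprise, -log p(o):
\begin{aligned}
F[q] &= \mathbb{E}_{q(z)}\big[\log q(z) - \log p(o, z)\big] \\
     &= \underbrace{\mathrm{KL}\big(q(z)\,\|\,p(z \mid o)\big)}_{\ge 0} \;-\; \log p(o) \;\;\ge\;\; -\log p(o)
\end{aligned}
% Minimising the KL to the true posterior is the same as tightening a bound on surprise.
```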
May 26, 2025 at 4:04 AM
Our key innovation:
- A family of importance-weighted straight-through estimators (IW-ST), which unify and generalize previous methods.
- No need for backprop-through-noise tricks.
- No batch norm.
Just clean, effective training.
4/6
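For context, the plain straight-through estimator that these importance-weighted variants generalise looks roughly like this (a generic sketch, not the IW-ST from the paper):

```python
# Plain straight-through estimator (STE): binary forward pass, identity backward pass.
# The IW-ST family generalises this idea; this snippet is only the baseline.
import torch

class StraightThroughSign(torch.autograd.Function):
    @staticmethod
    def forward(ctx, u):
        return (u > 0).float()      # hard 0/1 output

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out             # pretend the threshold was the identity

x = torch.randn(5, requires_grad=True)
y = StraightThroughSign.apply(x)
y.sum().backward()
print(y)        # binary values
print(x.grad)   # all ones: gradient passed straight through
```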
May 26, 2025 at 4:04 AM
We view training as Bayesian inference, minimizing KL divergence between a posterior and an amortized prior.
This lets us derive a principled loss from first principles—grounded in variational free energy, not heuristics.
3/6
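In generic notation (the paper's exact posterior/prior parameterisation differs in its details), the objective is the familiar free-energy trade-off between data fit and divergence from the prior:

```latex
% Generic variational free energy over weights w and data D (not the paper's exact notation):
F[q] \;=\; \underbrace{\mathrm{KL}\big(q(w)\,\|\,p(w)\big)}_{\text{complexity}}
\;-\; \underbrace{\mathbb{E}_{q(w)}\big[\log p(\mathcal{D}\mid w)\big]}_{\text{accuracy}}
```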
May 26, 2025 at 4:04 AM
Binary/spiking neural networks are efficient and brain-inspired—but notoriously difficult to train.
Why? Discrete activations → non-differentiable.
Most current methods either approximate gradients or add noisy surrogates.
We do something different.
2/6
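A toy PyTorch illustration of the obstacle:

```python
# Toy: a hard sign nonlinearity has zero derivative almost everywhere,
# so plain backprop sends no learning signal through it.
import torch

u = torch.linspace(-2.0, 2.0, 8, requires_grad=True)
out = torch.sign(u)          # discrete (+/-1) activation, as in binary nets
out.sum().backward()
print(u.grad)                # tensor of zeros: the gradient dies at the threshold
```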
May 26, 2025 at 4:04 AM
If brains infer control by predicting their own actions,
should future AI do the same?
Instead of optimizing over actions,
let’s build agents that explain their sensations.
Intelligence may not be about control—but coherence.
#AgencyByInference
May 25, 2025 at 11:01 AM
Maybe intelligence isn’t about maximizing reward…
but minimizing surprise in a world we predictively model.
What if agency is not learned—but inferred?
May 25, 2025 at 11:01 AM