Anirbit
@anirbit.bsky.social
Assistant Professor/Lecturer in ML @ The University of Manchester | https://anirbit-ai.github.io/ | working on the theory of neural nets and how they solve differential equations. #AI4SCIENCE
With all the renewed discussion about "Sparse AutoEncoders (#SAE)" as a way of doing #MechanisticInterpretability of #LLMs, I am resharing a part of my PhD in which, years ago, we proved how sparsity automatically emerges in autoencoding.

arxiv.org/abs/1708.03735
Sparse Coding and Autoencoders
In "Dictionary Learning" one tries to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in ...
arxiv.org
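For context, the generative model behind this setup (my quick sketch of the standard sparse-coding assumptions, not the paper's exact statement): data is generated as $y = A^* x^*$, where $A^* \in \mathbb{R}^{n \times h}$ is overcomplete ($h > n$) with unit-norm, mutually incoherent columns, and $x^* \in \mathbb{R}^h$ has at most $k \ll h$ non-zero coordinates. The question is then whether an autoencoder trained only on samples of $y$ recovers this hidden sparse structure.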
October 3, 2025 at 2:16 PM
Registrations for #DRSciML close by noon (Manchester time). Do register soon to ensure you get the Zoom links to attend this exciting event on the foundations of #ScientificML 💥
Registrations are now open for the international workshop on foundations of #AI4Science #SciML that we are hosting with Prof. Jakob Zech. In-person seats are very limited, please do register to join online 💥

drsciml.github.io/drsciml/
DRSciML
drsciml.github.io
September 8, 2025 at 8:41 AM
Recently I gave an online talk at India's premier institute IISc's "Bangalore Theory Seminars", where I explained our results on size lower bounds for neural models of solving PDEs. #SciML #AI4SCIENCE I cover work by my first-year PhD student, Sébastien.

youtu.be/CWvnhv1nMRY?...
Provable Size Requirements for Operator Learning and PINNs, by Anirbit Mukherjee
YouTube video by CSAChannel IISc
youtu.be
September 4, 2025 at 8:16 PM
Today is the 70th anniversary of the 1955 proposal for the Dartmouth summer project that officially marked the beginning of AI research 💥 Interestingly, "Objective 3" of that proposal was already about having a theory of neural nets. 🙂
stanford.io/2WJJJGN
stanford.io
August 31, 2025 at 5:52 PM
I got selected for the "Early Career Highlights" of ACM IKDD International Conference on Data Science (CODS) 2025. Looking forward to the talk at IISER, Pune in December.

www.acm.org/articles/acm...
Call for Papers & Proposals - IKDD CODS 2025
Call for papers and proposals announced for CODS 2025 across various tracks of the conference. The next edition of the conference will be held in IISER Pune on December 17-20, 2025. Go through the det...
www.acm.org
August 27, 2025 at 9:14 AM
Why does noisy gradient descent train neural nets? This fundamental question in ML remains largely open.

In our substantially revised draft, my student @dkumar9.bsky.social gives the full proof that a form of noisy GD, Langevin Monte-Carlo (#LMC), can learn depth-2 nets of any size and for any data.

arxiv.org/abs/2503.10428
Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
In this work, we will establish that the Langevin Monte-Carlo algorithm can learn depth-2 neural nets of any size and for any data and we give non-asymptotic convergence rates for it. We achieve this ...
arxiv.org
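For intuition, here is a minimal sketch of one #LMC update - plain gradient descent plus Gaussian noise whose scale is set by the step size and an inverse temperature. This is a generic textbook-style illustration in PyTorch, not the paper's code; the names lmc_step, step_size and inverse_temp are mine.

import torch

def lmc_step(params, loss_fn, step_size=1e-3, inverse_temp=1e4):
    # One Langevin Monte-Carlo update on a flat parameter tensor:
    # a gradient step plus Gaussian noise of scale sqrt(2 * step_size / inverse_temp).
    loss = loss_fn(params)
    (grad,) = torch.autograd.grad(loss, params)
    noise = torch.randn_like(params)
    with torch.no_grad():
        new_params = params - step_size * grad + (2 * step_size / inverse_temp) ** 0.5 * noise
    return new_params.requires_grad_(True)

Iterating a map like this on a (possibly regularized) training loss of a depth-2 net is, roughly, the algorithm the post refers to; the paper's contribution is the non-asymptotic convergence analysis.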
August 22, 2025 at 3:59 PM
Registrations are now open for the international workshop on foundations of #AI4Science #SciML that we are hosting with Prof. Jakob Zech. In-person seats are very limited, please do register to join online 💥

drsciml.github.io/drsciml/
DRSciML
drsciml.github.io
August 21, 2025 at 10:26 AM
Please do get in touch if you have published paper(s) on solving singularly perturbed PDEs using neural nets. #AI4Science #SciML
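For anyone unfamiliar with the term, a canonical example (my illustration, not a reference to any particular paper) is the convection-diffusion problem $-\varepsilon u''(x) + u'(x) = 1$ on $(0,1)$ with $u(0) = u(1) = 0$: as $\varepsilon \to 0^+$ the solution develops a boundary layer of width $O(\varepsilon)$ near $x = 1$, and resolving that layer is exactly where standard collocation-style solvers, and naive PINNs, tend to struggle.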
August 18, 2025 at 9:05 AM
Lucky to be hosted by a Gödel Prize winner, Prof. Sebastian Pokutta, and to present our work in his group 💥 Sebastian heads the Zuse Institute Berlin (#ZIB), an amazing oasis of applied mathematics that brings together experts from different institutes in Berlin.
August 7, 2025 at 4:34 PM
Reposted by Anirbit
Interested in statistics? Prof Subhashis Ghoshal will be delivering the below public lecture tomorrow:

Title: Immersion posterior: Meeting Frequentist Goals under Structural Restrictions
Time: Aug 5 16:00-17:00
Abstract: www.newton.ac.uk/seminar/45562/
Livestream: www.newton.ac.uk/news/watch-l...
August 4, 2025 at 10:45 AM
Hello #FAU. Thanks for so quickly arranging to host me and for letting me present our exciting mathematics of ML in infinite dimensions, #operatorlearning. #sciML Their "Pattern Recognition Laboratory" is turning 50! @andreasmaier.bsky.social 💥
August 2, 2025 at 6:31 PM
The University of Manchester has a 1-year post-doc position that I am happy to support in our group if you are currently an #EPSRC-funded PhD student and have the required specialization for work in our group. Typically we prefer candidates who have published in deep-learning theory or fluid theory.
July 24, 2025 at 1:23 PM
Do mark your calendars for "DRSciML" (Dr. Scientific ML 😉) on September 9 and 10 🔥
drsciml.github.io/drsciml/
- We are hosting a 2-day international workshop on understanding scientific ML.
- We have leading experts from around the world giving talks.
- There might be ticketing. Watch this space!
DRSciML
drsciml.github.io
July 23, 2025 at 4:53 PM
Major ML journals that have launched in recent years,

- dl.acm.org/journal/topml
- jds.acm.org
- link.springer.com/journal/44439
- academic.oup.com/rssdat
- jmlr.org/tmlr/
- data.mlr.press

There is no reason why these can't replace everything the current conferences are doing - and most likely do it better.
July 6, 2025 at 7:41 PM
So, the next time you train a deep-learning model, it's probably worthwhile to include as a baseline the only adaptive-gradient deep-learning algorithm with provable guarantees - our delta-GClip 🙂
Our "delta-GCLip" is the *only* known adaptive gradient algorithm that provably trains deep-nets AND is practically competitive. That's the message of our recently accepted #TMLR paper - and my 4th TMLR journal 🙂

openreview.net/pdf?id=ABT1X...

#optimization #deeplearningtheory
openreview.net
July 1, 2025 at 8:55 AM
Our insight is to introduce an intermediate form of gradient clipping that can leverage the PL* inequality of wide nets - something not known for standard clipping. Given that our algorithm works for transformers, maybe that points to some yet-unknown algebraic property of theirs. #TMLR
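To make that concrete, here is a rough sketch of what a "clipping with a floor" update could look like (my paraphrase of the idea in generic PyTorch; the exact definition of delta-GClip is in the paper, and the names delta_gclip_step, eta, gamma, delta are mine):

import torch

def delta_gclip_step(params, grads, eta=0.1, gamma=1.0, delta=1e-2):
    # Standard gradient clipping scales the step by min(1, gamma / ||g||);
    # here that scale is additionally floored at delta > 0, so the effective
    # step size never collapses on large gradients - the kind of property that
    # lets a PL*-style convergence argument go through for wide nets.
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))
    scale = max(delta, min(1.0, gamma / (grad_norm.item() + 1e-12)))
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(eta * scale * g)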
June 29, 2025 at 10:38 PM
Our "delta-GCLip" is the *only* known adaptive gradient algorithm that provably trains deep-nets AND is practically competitive. That's the message of our recently accepted #TMLR paper - and my 4th TMLR journal 🙂

openreview.net/pdf?id=ABT1X...

#optimization #deeplearningtheory
openreview.net
June 29, 2025 at 10:36 PM
An updated version of our slides on necessary conditions for #SciML,
- and more specifically,
"Machine Learning in Function Spaces/Infinite Dimensions".

It's all about the 2 key inequalities on slides 27 and 33.
Both come via similar proofs.

github.com/Anirbit-AI/S...
GitHub - Anirbit-AI/Slides-from-Team-Anirbit: Slide Presentations of Our Works
Slide Presentations of Our Works. Contribute to Anirbit-AI/Slides-from-Team-Anirbit development by creating an account on GitHub.
github.com
June 23, 2025 at 10:02 PM
Now our research group has a logo to succinctly convey what we do - prove theorems about using ML to solve PDEs, leaning towards operator learning. Thanks to #ChatGPT4o for converting my sketches into a digital image 🔥 #AI4Science #SciML
June 7, 2025 at 7:03 PM
It would be great to see a compiled list of useful PDEs that #PINNs struggle to solve - and how we would measure success there.

We know of edge cases with simple PDEs where PINNs struggle, but those often aren't the cutting-edge use-cases of PDEs.
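As a concrete baseline for what "struggle" could mean: the usual PINN recipe minimizes the PDE residual at collocation points plus a boundary penalty, and success is typically measured by the relative error against a reference solution. Below is a minimal generic sketch for the toy singularly perturbed problem $-\varepsilon u'' + u' = 1$ on $(0,1)$, $u(0)=u(1)=0$ (my illustration, not code from any particular paper; net, pinn_loss and eps are names I made up).

import torch

eps = 1e-3
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)

def pinn_loss(n_collocation=256):
    # PDE residual of -eps*u'' + u' - 1 at random collocation points in (0, 1),
    # plus a penalty forcing the network to vanish at both boundary points.
    x = torch.rand(n_collocation, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = -eps * d2u + du - 1.0
    boundary = net(torch.tensor([[0.0], [1.0]]))
    return (residual ** 2).mean() + (boundary ** 2).mean()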
May 24, 2025 at 4:28 PM
A revised version of our delta-GClip algorithm - probably the *only* deep-learning algorithm that provably trains deep nets while using step-size scheduling - which competes with, and even supersedes, heuristics like Adam and Adam+Clipping on transformers.

arxiv.org/abs/2404.08624
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
We present and analyze a novel regularized form of the gradient clipping algorithm, proving that it converges to global minima of the loss surface of deep neural networks under the squared loss, provi...
arxiv.org
April 9, 2025 at 9:08 AM
He was the first person to interview me for a PhD position in applied maths and stats when I decided to shift my career focus in that direction. Years later, when I became a faculty member, he accepted my invitation to fly to the UK from Leipzig to give a talk and meet my students. Sayan is irreplaceable 😶
To say that I am shocked at this tragic news is an understatement. I got to know Sayan well during my time at Duke; he was deeply knowledgeable about all sorts of things and had a great sense of humor. He was the one and only Bayesian anarchist. www.mis.mpg.de/news/loss-sa...
We Mourn the Loss of Sayan Mukherjee
We are shocked and saddened by the news that our colleague and Max Planck Fellow Sayan Mukherjee passed away. He was a dear colleague in Leipzig who formed a vital connection between mathematics, comp...
www.mis.mpg.de
April 4, 2025 at 2:02 PM
We had a fun evening designing logos for @aifunmcr.bsky.social . Would you favour some realism or surrealism?
April 2, 2025 at 1:30 PM
My PhD student @dkumar9.bsky.social has done an amazing job of blending multiple deep results to establish that the Langevin Monte-Carlo algorithm provably learns 2-layer nets for any data and at any size. This is, very rigorously, a "beyond NTK" regime. More to come from Dibyakanti 🙂
Noisy gradient descent has attracted a lot of attention in the last few years as a mathematically tractable model of actual deep-learning algorithms.

In my recent work with @anirbit.bsky.social and Samyak Jha (arxiv.org/abs/2503.10428), we prove noisy gradient descent learns neural nets.
Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
In this work, we will establish that the Langevin Monte-Carlo algorithm can learn depth-2 neural nets of any size and for any data and we give non-asymptotic convergence rates for it. We achieve this ...
arxiv.org
March 21, 2025 at 12:59 PM
My first-year PhD student Sébastien André-Sloan presents at an #INFORMS conference in Toronto. It's joint work with Matthew Colbrook at DAMTP, Cambridge. We prove a first-of-its-kind size requirement on neural nets for solving PDEs in the super-resolution setup - the natural setup for #PINNs.
March 15, 2025 at 7:10 PM