Gergely Neu
@neu-rips.bsky.social
full-time ML theory nerd, part-time AI non-enthusiast
other than that, make sure to check out the really cool lineup of speakers and the program on the website: eurips.cc
i understand that registrations are still open, with some massive discounts (and even fee waivers!) for students. hope to see many of you there!!!

5/5
October 29, 2025 at 11:47 AM
dear reviewers, of course that "no pdf" rebuttal policy is not going to stop me from running all those wonderful new baselines you just suggested, please find the results below
July 28, 2025 at 8:24 AM
what's the opposite of FOMO called again? either way, i'm right in the middle of that
(hope everyone else is doing fine at #ICML2025 nevertheless!)
July 16, 2025 at 10:54 PM
we also have some cool model-selection experiments that demonstrate the ability of our approach to compute similarities between Markov chains, which could potentially be very useful for representation learning in RL
10/n
May 26, 2025 at 1:27 PM
interestingly, the variables λ_X and λ_Y can be interpreted as "encoder-decoder" pairs giving the conditional distributions of Y|X and X|Y.
we ran some experiments to illustrate this, e.g., here we use SOMCOT to discover the latent structure of a block Markov chain without any prior knowledge
9/n
May 26, 2025 at 1:27 PM
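to make the "encoder-decoder" reading above concrete, here's a minimal numpy sketch (the coupling matrix is made up for illustration, not an output of SOMCOT): given a joint coupling over the two state spaces, normalizing its rows gives the conditional of Y given X, and normalizing its columns gives X given Y.

```python
import numpy as np

# Illustrative only: a joint coupling over a 3-state X and a 2-state Y.
# Rows index x, columns index y; entries sum to 1.
coupling = np.array([
    [0.30, 0.10],
    [0.05, 0.25],
    [0.20, 0.10],
])

# "Encoder" lambda_X: conditional distribution of Y given X
# (normalize each row by the marginal of X).
p_x = coupling.sum(axis=1, keepdims=True)
lambda_X = coupling / p_x          # lambda_X[x, y] ~ P(Y = y | X = x)

# "Decoder" lambda_Y: conditional distribution of X given Y
# (normalize each column by the marginal of Y).
p_y = coupling.sum(axis=0, keepdims=True)
lambda_Y = coupling / p_y          # lambda_Y[x, y] ~ P(X = x | Y = y)

# Sanity checks: rows of lambda_X and columns of lambda_Y are distributions.
assert np.allclose(lambda_X.sum(axis=1), 1.0)
assert np.allclose(lambda_Y.sum(axis=0), 1.0)
```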
the algorithm is easy to implement and comes with nice sample-complexity guarantees. notably, no "concentrability" / "coverage" assumptions are required for the state distributions, the guarantees only depend on the sizes of the two chains
8/n
May 26, 2025 at 1:27 PM
.. which we can solve by a stochastic primal-dual method that we call SOMCOT (stochastic optimization for Markov-chain optimal transport). the idea is to observe that all gradients can be computed as *marginal* expectations over transition data, which allows constructing an efficient SGDA (stochastic gradient descent-ascent) scheme
7/n
May 26, 2025 at 1:27 PM
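SOMCOT itself needs the paper's LP formulation, but the SGDA template it builds on can be sketched on a toy saddle-point problem (the objective, step sizes, and noise level below are all my own illustrative choices, not the paper's):

```python
import numpy as np

# Toy strongly-convex-strongly-concave objective
#   f(x, y) = x^2/2 + x*y - y^2/2,  with saddle point at (0, 0).
# Gradients are corrupted by noise to mimic the stochastic setting, where
# SOMCOT would instead estimate gradients from sampled transitions.
rng = np.random.default_rng(0)
x, y = 3.0, -2.0
avg_x, avg_y = 0.0, 0.0
for t in range(1, 20001):
    eta = 1.0 / np.sqrt(t)                    # decaying step size
    gx = x + y + 0.1 * rng.standard_normal()  # noisy df/dx
    gy = x - y + 0.1 * rng.standard_normal()  # noisy df/dy
    x -= eta * gx                             # descent for the min player
    y += eta * gy                             # ascent for the max player
    avg_x += (x - avg_x) / t                  # averaged iterates, the usual
    avg_y += (y - avg_y) / t                  # output in SGDA analyses
print(avg_x, avg_y)
```

both the final and the averaged iterates end up near the saddle point (0, 0).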
the main idea is rewriting the distance as a linear program whose constraints do not explicitly feature the transition kernels, only the *marginal distributions* of x->x' transitions. this allows us to formulate the problem as a stochastic min-max optimization game...
6/n
May 26, 2025 at 1:27 PM
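schematically (my notation, not the paper's), dualizing the marginal constraints of such a linear program gives a bilinear min-max game:

```latex
\min_{\mu \ge 0} \; \max_{\lambda} \;\;
  \langle c, \mu \rangle \;+\; \langle \lambda, \, A\mu - b \rangle
```

here \mu is the flattened coupling variable, c is the cost built from the state metric, and A\mu = b collects the marginal constraints; since the coupling term is bilinear, unbiased gradient estimates for both players can be formed from sampled x->x' transitions.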
in our neurips'24 paper, we developed some neat analytic tools and some computational methods to do this, but left a major open question behind: how to estimate distances for *unknown* Markov chains based on *samples only*?
4/n
May 26, 2025 at 1:27 PM
imagine i have two Markov chains M_X and M_Y, have a similarity metric over their state spaces, and want to lift this to a distance between the entire joint distributions. a powerful way to do that is via the tools of optimal transport (OT)
3/n
May 26, 2025 at 1:27 PM
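as background, here's what OT does in the simplest static case, via a tiny numpy Sinkhorn sketch (entropic regularization as a stand-in; lifting this to full Markov chains is what the thread's paper is about, and is considerably more involved):

```python
import numpy as np

# Entropy-regularized OT between two small distributions under a ground cost.
def sinkhorn(p, q, cost, reg=0.1, n_iter=500):
    K = np.exp(-cost / reg)             # Gibbs kernel from the ground cost
    u = np.ones_like(p)
    for _ in range(n_iter):             # alternating marginal projections
        v = q / (K.T @ u)
        u = p / (K @ v)
    plan = u[:, None] * K * v[None, :]  # transport plan with marginals p, q
    return np.sum(plan * cost), plan

p = np.array([0.5, 0.5])
q = np.array([0.25, 0.75])
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
dist, plan = sinkhorn(p, q, cost)
print(round(dist, 3))
```

with this 0/1 ground cost, the (lightly smoothed) computed value lands close to the exact OT distance of 0.25, i.e. the mass that has to be moved.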
in a 2024 paper (w/ Sergio Calo, Anders Jonsson, Ludovic Schwartz & Javier Segovia-Aguas), we explored the question of defining and computing similarity metrics between Markov chains (arxiv.org/abs/2406.04056). we are back asking the same question, and giving some even more satisfying answers
2/n
May 26, 2025 at 1:27 PM
new work on computing distances between stochastic processes ***based on sample paths only***! we can now:
- learn distances between Markov chains
- extract "encoder-decoder" pairs for representation learning
- with sample- and computational-complexity guarantees
read on for some quick details..
1/n
May 26, 2025 at 1:27 PM
posting something nice soon
May 17, 2025 at 7:48 AM
time to smash that TURBO button
May 15, 2025 at 5:18 PM
looking forward to giving a talk at the 2025 GHOST day in the beautiful city of Poznan (Poland) --- hope to see you there this weekend!
May 6, 2025 at 7:06 PM
"no politics in the classroom", right?
(from Boaz's ALT 2025 keynote)
May 3, 2025 at 2:42 PM
this work was gloriously led by my absolutely ✨brilliant✨ postdocs: Eugenio Clerico and Hamish Flynn. they're both on the job market so do reach out to them if you are looking to recruit some outstanding learning theorists.

my friend Wojtek Kotlowski also helped out quite a lot.

12/
April 23, 2025 at 10:41 PM
the proof is really simple, and possibly even trivial. (just notice that the second term in this decomposition is almost always negative: you can't predict Y_t better than theta* if you have to predict sequentially.)

8/
April 23, 2025 at 10:41 PM
our main result is the following: if i can design an online learning algorithm for this game with regret bounded by B, then i can set my confidence width as beta_n = B + log(1/delta), and end up with a confidence set that's valid with probability at least 1-delta

7/
April 23, 2025 at 10:41 PM
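in code, the recipe is literally one line (the value of B below is a made-up placeholder, not a bound from the paper):

```python
import math

def confidence_width(B, delta):
    # Reduction from the thread: regret bound for the prediction game
    # plus the log(1/delta) price of the failure probability.
    return B + math.log(1.0 / delta)

B = 12.5  # hypothetical regret bound for some model class
print(confidence_width(B, delta=0.1))
print(confidence_width(B, delta=0.01))
```

tighter regret analyses immediately buy tighter confidence sets, which is the whole point of the reduction.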
so here's the idea. let us introduce a game of "sequential probability assignment" where an online learner is trying to predict the next Y_t given all the past data, and define its regret against the best fixed strategy in hindsight that always predicts using the MLE

6/
April 23, 2025 at 10:41 PM
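a concrete toy instance of this game, for Bernoulli data: the online learner below is the classic Laplace add-one rule (my choice for illustration; the paper's setting is GLMs), and its log-loss regret against the hindsight MLE stays logarithmic in n.

```python
import numpy as np

# Sequential probability assignment on a binary sequence: the online
# learner (Laplace add-one predictor) is compared, in cumulative log loss,
# to the hindsight MLE, i.e. the best fixed Bernoulli parameter.
rng = np.random.default_rng(1)
y = rng.binomial(1, 0.3, size=1000)

online_loss = 0.0
ones = 0
for t, yt in enumerate(y):
    p = (ones + 1) / (t + 2)            # Laplace rule: P(Y_t = 1 | past)
    online_loss += -np.log(p if yt == 1 else 1 - p)
    ones += yt

# Hindsight MLE for a Bernoulli model: the empirical frequency of ones.
phat = y.mean()
mle_loss = -(y.sum() * np.log(phat) + (len(y) - y.sum()) * np.log(1 - phat))

regret = online_loss - mle_loss
print(regret)   # nonnegative, and at most log(n + 1) for this predictor
```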
now, the problem that motivated the present work is that i can never remember the details of this construction and the proof, so i wanted to find an alternative that reduces the problem to something that i understand reasonably well: ONLINE LEARNING

5/
April 23, 2025 at 10:41 PM
we specifically consider sets that contain all parameters whose likelihood is not much smaller than that of the maximum likelihood estimator. thus we would like to find methods for picking the "cutoff" parameter beta_n in a way that leads to small and valid confidence sets

3/
April 23, 2025 at 10:41 PM
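here's that construction in miniature for a Bernoulli model (much simpler than the GLMs in the paper, and with an arbitrary placeholder cutoff beta): keep every parameter whose log-likelihood is within beta of the MLE's.

```python
import numpy as np

# Likelihood-ratio confidence set for a Bernoulli mean, as a toy stand-in
# for the GLM setting described in the thread.
rng = np.random.default_rng(2)
y = rng.binomial(1, 0.6, size=200)
k, n = y.sum(), len(y)

def loglik(theta):
    return k * np.log(theta) + (n - k) * np.log(1 - theta)

theta_hat = k / n                   # MLE maximizes the likelihood
beta = 3.0                          # placeholder cutoff, not from the paper
grid = np.linspace(0.01, 0.99, 99)
conf_set = grid[loglik(theta_hat) - loglik(grid) <= beta]
print(conf_set.min(), theta_hat, conf_set.max())
```

because the Bernoulli log-likelihood is concave in the parameter, the set comes out as an interval around the MLE.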
we consider generalized linear models (see pic below), and would like to construct a small set that contains the true parameter theta* with high probability, based on observations (X_t,Y_t)_{t=1}^n

2/
April 23, 2025 at 10:41 PM
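the pic doesn't survive as text, so for reference, a standard exponential-family GLM with canonical link reads (my notation; the paper's exact assumptions may differ):

```latex
p_\theta(y \mid x) \;=\; h(y)\,
  \exp\!\bigl( y \langle \theta, x \rangle - \Phi(\langle \theta, x \rangle) \bigr)
```

with \Phi the log-partition function; e.g. logistic regression corresponds to \Phi(z) = \log(1 + e^z).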
new paper online:

"CONFIDENCE SEQUENCES FOR GENERALIZED LINEAR MODELS VIA REGRET ANALYSIS"

TL;DR: we reduce the problem of designing tight confidence sets for statistical models to proving the existence of small regret bounds in an online prediction game

read on for a quick thread 👀👀👀

1/
April 23, 2025 at 10:41 PM