Marco Cuturi
marcocuturi.bsky.social
machine learning researcher @ Apple machine learning research
We also introduce two coupling approaches advocated this summer to improve FM training: using either very large sharp Sinkhorn couplings (arxiv.org/abs/2506.05526) or, even better, semidiscrete couplings (arxiv.org/abs/2509.25519), as proposed with Alireza Mousavi-Hosseini and
@syz.bsky.social
On Fitting Flow Models with Large Sinkhorn Couplings
November 5, 2025 at 2:04 PM
Reposted by Marco Cuturi
Afternoon talks by:
@marcocuturi.bsky.social
Elena Agliari
Jan Gerken

Thanks all for the great talks, conversations, and engagement! Fingers crossed we get to host this event a 4th time next year and see many of you back in Gothenburg 🤞🇸🇪
October 29, 2025 at 8:58 PM
Then there's always 𝜀 regularization. When 𝜀=∞, we recover vanilla FM. At this point we're not completely sure whether 𝜀=0 is better than 𝜀>0: they both work! 𝜀=0 has a minor edge at larger scales (sparse gradients, faster assignment, slightly better metrics), but 𝜀>0 is also useful (faster SGD).
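To make the role of 𝜀 concrete, here is a toy Sinkhorn sketch (plain NumPy, illustrative only, nothing like the large-scale solver in the paper): large 𝜀 yields a coupling close to the independent pairing used by vanilla FM, while small 𝜀 yields a sharp, near-assignment coupling.

```python
# Toy entropic OT (Sinkhorn) sketch; illustrative only, not the paper's solver.
import numpy as np

def sinkhorn_plan(x, y, eps, iters=500):
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared-Euclidean costs
    K = np.exp(-C / eps)                                # Gibbs kernel
    a = np.full(len(x), 1.0 / len(x))                   # uniform marginals
    b = np.full(len(y), 1.0 / len(y))
    v = np.ones(len(y))
    for _ in range(iters):                              # alternating scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]                  # the coupling matrix

rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 2)), rng.normal(size=(5, 2))
P_sharp = sinkhorn_plan(x, y, eps=0.05)   # concentrated, near an assignment
P_blur = sinkhorn_plan(x, y, eps=100.0)   # close to the independent coupling
```

Both extremes produce valid couplings (their marginals are uniform); what changes with 𝜀 is how concentrated the pairings are.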
October 4, 2025 at 11:21 AM
Thanks for the nice comments! My interpretation is that we're using OT to produce pairs (x_i, y_i) to guide FM. With that, it's up to you to provide an inductive bias (a model) that gets f(x)~=y while generalizing. The hard OT assignment could be that model, but it would fail to generalize.
October 4, 2025 at 11:21 AM
For people who like OT, IMHO the very encouraging insight is that we have evidence that the "better" you solve your OT problem, the more flow matching metrics improve; this is Figure 3.
October 4, 2025 at 8:45 AM
Thanks @rflamary.bsky.social! Yes, exactly. We try to summarize this tradeoff in Table 1, in which we show that for a one-off preprocessing cost, we now get all the (noise, data) pairings you might need during flow matching training for "free" (up to the MIPS lookup for each noise sample).
October 4, 2025 at 8:44 AM
The paper is out: arxiv.org/abs/2509.25519

Michal also did a fantastic push to open source the semidiscrete solver prepared by Stephen and Alireza in the OTT-JAX library. We plan to open source the flow pipeline in JAX soon. Please reach out if interested!
Flow Matching with Semidiscrete Couplings
October 3, 2025 at 9:02 PM
This is much faster than using Sinkhorn, and generates samples of higher quality.

As a bonus, you can forget about entropy regularization (set ε=0), apply things like correctors to guidance, and use it on consistency-type models, or even with conditional generation.
October 3, 2025 at 9:00 PM
The great thing with SD-OT is that this only needs to be computed once: you only need to store a real number per data sample. You can precompute these numbers once and for all using stochastic convex optimization.

When training a flow model, you assign noise to data using these numbers.
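A minimal sketch of that assignment step, assuming a squared-Euclidean cost (the `g` array stands in for the per-sample dual scalars mentioned above; here it is left at zero, which reduces to a nearest-neighbor lookup — not the OTT-JAX implementation):

```python
# Hedged sketch of semidiscrete OT assignment with squared-Euclidean cost.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 8))   # y_j: the discrete data samples
g = np.zeros(len(data))             # one precomputed real number per sample

def assign(noise, data, g):
    # argmin_j ||x - y_j||^2 / 2 - g_j, rewritten as a MIPS lookup:
    # argmax_j <x, y_j> + (g_j - ||y_j||^2 / 2)
    offsets = g - 0.5 * (data ** 2).sum(axis=1)
    return (noise @ data.T + offsets).argmax(axis=1)

x = rng.normal(size=(4, 8))         # fresh noise draws
pairs = data[assign(x, data, g)]    # matched data points for FM training
```

Because the score is an inner product plus a per-sample offset, the lookup can be served by any MIPS index once the duals `g` are in hand.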
October 3, 2025 at 8:56 PM
In practice, however, this idea only begins to work when using massive batch sizes (see arxiv.org/abs/2506.05526). The problem is that the cost of running Sinkhorn on millions of points can quickly balloon...

Our solution? Rely on semidiscrete OT at scales that were never considered before.
October 3, 2025 at 8:56 PM
You're right that the PCs' message uses space as a justification to accept fewer papers, but it does not explicitly mention that the acceptance rate should be lower than the historical standard of 25%. In my SAC batch, the average acceptance rate before their email was closer to 30%, but that's just me...
August 29, 2025 at 11:32 AM
I see it a bit differently. The new system pushed reviewers aggressively to react to rebuttals. I think this is a great change, but it has clearly skewed results, creating many spurious grade upgrades. Now the system must be rebalanced in the other direction by SACs/ACs for results to be fair.
August 29, 2025 at 7:05 AM