Rémi Flamary
@rflamary.bsky.social
ML Professor at École Polytechnique. Python open source developer. Co-creator/maintainer of POT, SKADA. https://remi.flamary.com/
Yesterday @tgnassou.bsky.social successfully defended his PhD thesis on domain adaptation for signals, in particular EEG. Huge congrats to him for all his work and his wonderful slides. It was a pleasure to be his advisor with @agramfort.bsky.social, and I can't wait to see what he will do next!
November 7, 2025 at 11:10 AM
Happy ML researcher and open source developer presenting his toolbox SKADA at PyData Paris. Congrats @tgnassou.bsky.social, the presentation was awesome!
September 30, 2025 at 3:28 PM
Did not know that all authors who submit 3 or more ICLR papers must review 6 (or more?) papers. It seems fair to review 2 papers per submission, but after the full NeurIPS summer (and rejection) I would have appreciated a reduced load so I could do them better (I say that as an ICML 2025 top reviewer).
September 23, 2025 at 7:23 PM
This prompt injection for Gemini CLI is just fascinating. It's crazy that because of LLMs, hackers can now do social engineering without a human in the loop. But I guess giving execution rights to an LLM is already a big human failure anyway. arstechnica.com/security/202...
July 30, 2025 at 2:50 PM
Finally, and most important, thanks to my awesome collaborators: Yanis Lalou, @tgnassou.bsky.social, @antoinecollas.bsky.social, Antoine de Mathelin, @ambroiseodt.bsky.social, Thomas Moreau, Alexandre Gramfort, and all the SKADA contributors. 14/14 scikit-adaptation.github.io
July 29, 2025 at 12:54 PM
SKADA-bench is implemented using the SKADA DA toolbox and Benchopt to ensure distributed runs on Slurm servers and 100% reproducibility. You can add new DA methods or datasets with just a few lines of code. 11/n github.com/scikit-adapt...
July 29, 2025 at 12:54 PM
We also report the computational time of all methods (including validation time), which shows that even fast methods can take very long to validate when they have multiple hyperparameters. 10/n
July 29, 2025 at 12:54 PM
We also provide in the supplementary a study of the loss of performance due to realistic validation (DA scorers) vs unrealistic validation using target labels, and find that some methods can lose 5 to 10% accuracy in this case. 9/n
July 29, 2025 at 12:54 PM
We also compared several Deep DA methods based on invariant feature representations and show that, as expected, they beat shallow methods on vision DA, but they can fail on other modalities such as biomedical signals (and be worse than shallow methods). 8/n
July 29, 2025 at 12:54 PM
We also provide a comparison of different realistic DA scorers and plot their correlation with the true target accuracy. Interestingly, the best one across models and modalities is one of the oldest but rarely used in practice: circular validation. 6/n
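A minimal sketch of circular validation used as a DA scorer, assuming the usual recipe (train on source, pseudo-label the target, retrain on the pseudo-labelled target, check agreement back on the labelled source); the function name is illustrative, not the benchmark's implementation:

```python
from sklearn.base import clone
from sklearn.metrics import accuracy_score


def circular_validation_score(estimator, Xs, ys, Xt):
    """Score a model without target labels via a source->target->source round trip."""
    fwd = clone(estimator).fit(Xs, ys)          # forward direction: learn on source
    yt_pseudo = fwd.predict(Xt)                 # pseudo-label the unlabelled target
    bwd = clone(estimator).fit(Xt, yt_pseudo)   # backward direction: learn on pseudo-labelled target
    return accuracy_score(ys, bwd.predict(Xs))  # agreement with the true source labels

# Higher scores suggest the adaptation is consistent in both directions,
# without ever touching target labels.
```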
July 29, 2025 at 12:54 PM
One nice result is that linear alignment methods (CORAL and Linear Optimal Transport), which have few parameters, seem to work well on real data and are the only methods that do not decrease performance compared to no DA across modalities. 5/n
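For reference, a minimal NumPy sketch of CORAL-style linear alignment (whiten the source features, then re-colour them with the target covariance); the function name is illustrative and this is not the benchmark's own implementation:

```python
import numpy as np
from scipy.linalg import inv, sqrtm


def coral_align(Xs, Xt, reg=1e-6):
    """Align source features Xs to the second-order statistics of target features Xt."""
    Cs = np.cov(Xs, rowvar=False) + reg * np.eye(Xs.shape[1])  # regularized source covariance
    Ct = np.cov(Xt, rowvar=False) + reg * np.eye(Xt.shape[1])  # regularized target covariance
    Xs_white = Xs @ inv(np.real(sqrtm(Cs)))   # whitening: remove source correlations
    return Xs_white @ np.real(sqrtm(Ct))      # re-colouring: impose target correlations

# A classifier trained on coral_align(Xs, Xt) with the source labels
# can then be applied directly to the target data Xt.
```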
July 29, 2025 at 12:54 PM
Results on shallow DA show that methods designed for specific shifts work as expected on simulated data with those shifts. But on real data, adaptation performance (and whether it beats training on source data only) is very dataset- and method-dependent. Many DA methods perform worse than no DA. 4/n
July 29, 2025 at 12:54 PM
The benchmark uses nested cross-validation with realistic DA scorers that do not use target data labels (since they are unavailable in practice) to select the methods' hyperparameters. This is a reality check for more complex DA approaches that are difficult to validate in practice. 3/n
July 29, 2025 at 12:54 PM
We compare 21 shallow DA methods and 7 deep DA methods on multiple modalities including simulated shifts, images, text, tabular data and biomedical signals. For shallow methods, we used pre-trained deep embeddings to evaluate their ability to adapt in the feature space. 2/n
July 29, 2025 at 12:54 PM
Edwige Cyffer presenting her work on the interaction between decentralization and privacy after receiving the 2025 best PhD prize from the French machine learning scientific society (SFFAM) at CAP 2025.
July 1, 2025 at 2:39 PM
Domain invariant representation uses deep representation learning to find a feature space that cancels domain specificity while keeping discriminability. 6/6
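For concreteness, a minimal PyTorch sketch of one classic way to learn such invariant features (DANN-style gradient reversal); this is an illustrative example under that assumption, not a specific method from the thread, and the layer sizes are arbitrary:

```python
import torch
from torch import nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, reversed (scaled) gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # reverse the gradient flowing to the features


features = nn.Sequential(nn.Linear(64, 32), nn.ReLU())  # shared feature extractor
label_head = nn.Linear(32, 10)                          # trained with labelled source data
domain_head = nn.Linear(32, 2)                          # tries to tell source from target apart


def forward_pass(x, lam=1.0):
    z = features(x)
    return label_head(z), domain_head(GradReverse.apply(z, lam))

# Minimising both the label loss and the domain loss makes the features
# discriminative for the task while fooling the domain classifier,
# i.e. domain-invariant.
```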
July 1, 2025 at 9:39 AM
The domain invariant subspace assumption supposes that there exists a subspace of the features where the shift between domains is canceled and the label conditionals are preserved. 5/6
July 1, 2025 at 9:39 AM
Conditional shift is more general and means that the label conditionals change. Compensating for it requires a model of the change, such as a mapping in the feature space that can be used to adapt the source data to the target. Mapping models range from affine maps to more complex optimal transport mappings. 4/6
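A minimal sketch of such an affine mapping, assuming both domains are modelled as Gaussians (this is the closed-form optimal transport map between Gaussians); the function name is illustrative, and richer mapping estimators exist, e.g. in POT's ot.da module:

```python
import numpy as np
from scipy.linalg import inv, sqrtm


def gaussian_ot_map(Xs, Xt, reg=1e-6):
    """Affine map sending the Gaussian fit of the source onto the Gaussian fit of the target."""
    mu_s, mu_t = Xs.mean(axis=0), Xt.mean(axis=0)
    Cs = np.cov(Xs, rowvar=False) + reg * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + reg * np.eye(Xt.shape[1])
    Cs_half = np.real(sqrtm(Cs))
    Cs_half_inv = inv(Cs_half)
    # A = Cs^{-1/2} (Cs^{1/2} Ct Cs^{1/2})^{1/2} Cs^{-1/2}
    A = Cs_half_inv @ np.real(sqrtm(Cs_half @ Ct @ Cs_half)) @ Cs_half_inv
    return lambda X: mu_t + (X - mu_s) @ A.T

# Usage: map_fn = gaussian_ot_map(Xs, Xt); Xs_adapted = map_fn(Xs)
```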
July 1, 2025 at 9:39 AM
Target shift (or label shift) happens when the marginal p(y) changes, for instance the proportions of the classes. It can be compensated by re-weighting with the ratio of p(y). 3/6
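A minimal sketch of this label-shift re-weighting, assuming the target class proportions have already been estimated by some other means (they cannot be read directly from unlabelled target data); the names below are illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def label_shift_weights(ys, target_priors):
    """Per-sample weights w_i = p_target(y_i) / p_source(y_i) for the source samples."""
    classes, counts = np.unique(ys, return_counts=True)
    source_priors = counts / counts.sum()
    ratio = {c: target_priors[c] / p for c, p in zip(classes, source_priors)}
    return np.array([ratio[y] for y in ys])

# Usage: reweight the source samples when fitting the classifier, e.g.
# LogisticRegression().fit(Xs, ys, sample_weight=label_shift_weights(ys, est_target_priors))
```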
July 1, 2025 at 9:39 AM
Covariate shift happens when the feature marginal P(x) changes across domains (but the label conditional distributions are preserved). It can be compensated by re-weighting samples with the ratio of P(x). 2/6
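A minimal sketch of this re-weighting, assuming the ratio P_target(x)/P_source(x) is estimated with a source-vs-target domain classifier (one common density-ratio estimator; the function name is illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def covariate_shift_weights(Xs, Xt):
    """Estimate w(x) = P_target(x) / P_source(x) for each source sample."""
    X = np.vstack([Xs, Xt])
    d = np.concatenate([np.zeros(len(Xs)), np.ones(len(Xt))])  # 0 = source, 1 = target
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p_t = clf.predict_proba(Xs)[:, 1]             # P(domain = target | x)
    return (p_t / (1 - p_t)) * (len(Xs) / len(Xt))  # odds, corrected for sample sizes

# Usage: fit any classifier on the source data with these sample weights, e.g.
# LogisticRegression().fit(Xs, ys, sample_weight=covariate_shift_weights(Xs, Xt))
```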
July 1, 2025 at 9:39 AM
The most important aspect when facing data shift is the type of shift present in the data. I will give below a few examples of shifts and some existing methods to compensate for them. 🧵 1/6
July 1, 2025 at 9:39 AM
Time to take a vacation!
June 29, 2025 at 12:17 PM
DistR also allows original things such as doing PCA/t-SNE on a fixed support, where for instance the OT plan gives the assignment of samples to pixels on a regular grid. So if you have a prior about a fixed support in low dimension, you can use it. 5/5
June 27, 2025 at 7:44 AM
By doing simultaneous clustering and dimensionality reduction, DistR can provide more interpretable visualizations of the data (in Euclidean space or the Poincaré ball). 4/5
June 27, 2025 at 7:44 AM
When n<N and d=D, a clustering is performed and the OT plan provides the assignments between data samples and clusters. When n=N and d<D, we recover classical dimensionality reduction methods (PCA, KPCA, t-SNE). But the more general case n<N and d<D leads to a novel simultaneous DR/clustering. 3/5
June 27, 2025 at 7:44 AM