Stegle Lab
@steglelab.bsky.social
Our group develops and applies computational approaches to study molecular variation and its phenotypic consequences. We are part of DKFZ and EMBL.
Website: https://steglelab.org/
Many thanks to all authors (Arber Qoku, Martin Rohbeck, Florin Walter, Ilia Kats, Florian Buettner) and to everyone who supported this work. Funding from @erc.europa.eu, @denbi.bsky.social and @dfg.de.
@dkfz.bsky.social @embl.org @oliverstegle.bsky.social

📑 Preprint: doi.org/10.1101/2025...
13/13
MOFA-FLEX: A Factor Model Framework for Integrating Omics Data with Prior Knowledge
Latent factor models are first-line analysis approaches for single- and multi-omics data, essential for data integration, alignment, and biological signal discovery. To cater for new technologies and ...
doi.org
November 7, 2025 at 10:29 AM
You’ll find all the code and documentation here:
mofaflex.readthedocs.io/stable/

pip install mofaflex

… and you’re ready to go! 12/n
MOFA-FLEX
MOFA-FLEX is a versatile factor analysis framework designed to streamline the construction and training of complex matrix factorisation models fo...
mofaflex.readthedocs.io
November 7, 2025 at 10:29 AM
Curious to build your very first MOFA-FLEX model and test it on your own data?

Check out the tutorial thread by @mlolab.bsky.social to see how it’s done: bsky.app/profile/mlol... 11/n
Working with multi-omic or spatial data?

Meet MOFA-FLEX — a modular framework for interpretable factor analysis. Build custom models, select sparsity priors, leverage domain knowledge, integrate spatial information and explore results!

Here’s a quick tutorial 👇
November 7, 2025 at 10:29 AM
MOFA-FLEX is a unified framework that advances factor analysis for the omics community. Built on modular probabilistic programming, it lets researchers quickly design and deploy custom latent variable models by assembling reusable modules, fostering rapid innovation and extensibility. 10/n
November 7, 2025 at 10:29 AM
3️⃣ We applied MOFA-FLEX to breast cancer data combining Xenium (166K cells; ~300-gene spatial) and Chromium (30K cells; ~3K-gene non-spatial) datasets from adjacent FFPE slides, revealing latent factors capturing global cell-type variation and transcriptome-wide spatially resolved gene programs. 9/n
November 7, 2025 at 10:29 AM
2️⃣ Applied to a multi-omic CITE-seq dataset from murine spleen and lymph nodes (N=14,870 cells), collected in two independent batches on different days, MOFA-FLEX disentangled batch variation from cell-type variation. 8/n
November 7, 2025 at 10:29 AM
1️⃣ We applied MOFA-FLEX to an scRNA-seq dataset of PBMCs from lupus patients (N=13,576 cells), with or without IFN-β treatment, showing that it identifies IFN-β gene programs consistent with known biology and reveals new program-linked genes (e.g., TNFSF10). 7/n
November 7, 2025 at 10:29 AM
We showcase MOFA-FLEX in diverse applications: 1️⃣ robust recovery of gene programs from noisy prior knowledge in scRNA-seq, 2️⃣ disentangling technical and biological variation in multi-omic CITE-seq, and 3️⃣ spatial modelling revealing disease-linked gene programs in breast cancer. 6/n
November 7, 2025 at 10:29 AM
Additionally, MOFA-FLEX adds a domain knowledge module that links latent factors to gene programs. It seamlessly integrates resources like Hallmark, Reactome, or custom sets to guide model learning and enhance the interpretability of discovered biological factors. 5/n
November 7, 2025 at 10:29 AM
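To make the idea of linking latent factors to gene programs concrete, here is a minimal sketch of one common way to encode prior knowledge: a binary factor-by-gene mask built from gene sets. This is not the MOFA-FLEX API. The function name `gene_set_mask`, the gene list, and the two toy gene sets are assumptions made up for illustration and do not reproduce Hallmark or Reactome content.

```python
import numpy as np


def gene_set_mask(genes, gene_sets):
    """Build a (n_sets, n_genes) 0/1 matrix marking which genes belong to which set.

    Hypothetical helper for illustration only; not part of MOFA-FLEX.
    """
    index = {gene: j for j, gene in enumerate(genes)}
    names = sorted(gene_sets)  # fixed row order for reproducibility
    mask = np.zeros((len(names), len(genes)), dtype=np.int8)
    for i, name in enumerate(names):
        for gene in gene_sets[name]:
            j = index.get(gene)
            if j is not None:  # ignore annotated genes that were not measured
                mask[i, j] = 1
    return mask, names


# Toy inputs (assumed, illustrative only).
genes = ["ISG15", "IFI6", "TNFSF10", "CD3E", "MS4A1"]
gene_sets = {
    "toy_interferon_response": {"ISG15", "IFI6", "NOT_MEASURED"},
    "toy_lymphocyte_markers": {"CD3E", "MS4A1"},
}

mask, names = gene_set_mask(genes, gene_sets)
for name, row in zip(names, mask):
    print(name, row.tolist())
```

In a model informed by such a mask, each row would typically act as a soft prior on one factor's gene weights, so a gene outside the annotated set (here TNFSF10, which has an all-zero column) can still receive a non-zero weight if the data support it.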
We teamed up with @mlolab.bsky.social to develop MOFA-FLEX, which addresses these challenges. Built on probabilistic programming, it unifies factor analysis extensions (flexible priors, non-negativity, supervision, alternative likelihoods), enabling declarative model design without manual engineering. 4/n
November 7, 2025 at 10:29 AM
However, the rapidly evolving technological and experimental landscape demands universal, user-friendly frameworks that can be tailored to specific needs, such as incorporating spatial structure, temporal dynamics, noisy single-cell data, and domain-specific knowledge. 3/n
November 7, 2025 at 10:29 AM
Advances in technology have enabled the parallel analysis of multiple biological layers. With the Buettner lab and others, we developed MOFA and related latent factor models to uncover the key sources of variation in multi-omics data. Today, factor models are first-line tools for single-cell and multi-omics analysis. 2/n
November 7, 2025 at 10:29 AM
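For readers new to latent factor models, the basic idea can be sketched as a low-rank decomposition Y ≈ Z W, with cells by factors in Z and factors by features in W. The example below fits such a decomposition to simulated data with a truncated SVD (essentially PCA). This is a deliberate simplification: MOFA and MOFA-FLEX use probabilistic inference with priors and likelihoods, not plain SVD, and all sizes and variable names here are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from a linear factor model: Y = Z W + noise (toy sizes, assumed).
n_cells, n_genes, n_factors = 200, 50, 3
Z_true = rng.normal(size=(n_cells, n_factors))   # factor values per cell
W_true = rng.normal(size=(n_factors, n_genes))   # factor weights per gene
Y = Z_true @ W_true + 0.1 * rng.normal(size=(n_cells, n_genes))

# Centre each gene, then keep the leading singular components.
Y_centered = Y - Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y_centered, full_matrices=False)
Z_hat = U[:, :n_factors] * s[:n_factors]         # estimated factor values
W_hat = Vt[:n_factors]                           # estimated weights

# Fraction of total variance captured by the retained factors.
explained = (s[:n_factors] ** 2).sum() / (s ** 2).sum()
print(Z_hat.shape, W_hat.shape, round(float(explained), 3))
```

The recovered factors match the simulated ones only up to rotation and sign, which is one reason sparsity priors and prior knowledge are useful for interpretability.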