Miquel Anglada-Girotto
@m1quelag.bsky.social
Love predicting genomic things. Postdoc @crgenomica.bsky.social at the Probabilistic Machine Learning and Genomics group.

Creator of @splicingnews.bsky.social
Wouldn’t it be cool to leverage the throughput of single-cell data to study splicing regulation even when we lack exon resolution? 😀

Here’s the peer-reviewed version of our paper on how we can measure changes in splicing factor activity in virtually any single-cell dataset: doi.org/10.1093/nar/...
Using single-cell perturbation screens to decode the regulatory architecture of splicing factor programs
Abstract. Splicing factors shape the isoform pool of most transcribed genes, playing a critical role in cellular physiology. Their dysregulation is a hallm
October 25, 2025 at 2:16 PM
Couldn't think of a better place to make models! Come join us!
Applications are open for the @crg_eu PhD Programme! 20 fully funded positions — including one in our group through the Evolutionary Medical Genomics ITN.

Join us to develop deep generative models of cross-species data to tackle open questions in disease genetics.

www.crg.eu/en/content/t...
October 23, 2025 at 12:11 PM
Es Castell
July 15, 2025 at 7:21 PM
Reposted by Miquel Anglada-Girotto
An organoid model of the menstrual cycle reveals the role of the luminal epithelium in regeneration of the human endometrium https://www.biorxiv.org/content/10.1101/2025.07.03.663000v1
July 7, 2025 at 10:30 AM
Reposted by Miquel Anglada-Girotto
I am very happy to have posted my first bioRxiv preprint. A long time in the making - and still adding a few final touches to it - but we're excited to finally have it out there in the wild:
www.biorxiv.org/content/10.1...
Read below for a few highlights...
Decoding cnidarian cell type gene regulation
Animal cell types are defined by differential access to genomic information, a process orchestrated by the combinatorial activity of transcription factors that bind to cis-regulatory elements (CREs) to control gene expression. However, the regulatory logic and specific gene networks that define cell identities remain poorly resolved across the animal tree of life. As early-branching metazoans, cnidarians can offer insights into the early evolution of cell type-specific genome regulation. Here, we profiled chromatin accessibility in 60,000 cells from whole adults and gastrula-stage embryos of the sea anemone Nematostella vectensis. We identified 112,728 CREs and quantified their activity across cell types, revealing pervasive combinatorial enhancer usage and distinct promoter architectures. To decode the underlying regulatory grammar, we trained sequence-based models predicting CRE accessibility and used these models to infer ontogenetic relationships among cell types. By integrating sequence motifs, transcription factor expression, and CRE accessibility, we systematically reconstructed the gene regulatory networks that define cnidarian cell types. Our results reveal the regulatory complexity underlying cell differentiation in a morphologically simple animal and highlight conserved principles in animal gene regulation. This work provides a foundation for comparative regulatory genomics to understand the evolutionary emergence of animal cell type diversity.
Competing Interest Statement: The authors have declared no competing interest.
European Research Council, https://ror.org/0472cxd90, ERC-StG 851647
Ministerio de Ciencia e Innovación, https://ror.org/05r0vyz12, PID2021-124757NB-I00, FPI Severo Ochoa PhD fellowship
European Union, https://ror.org/019w4f821, Marie Skłodowska-Curie INTREPiD co-fund agreement 75442, Marie Skłodowska-Curie grant agreement 101031767
July 6, 2025 at 6:15 PM
Reposted by Miquel Anglada-Girotto
Last week I released bpnet-lite v0.5.0.

BPNet/ChromBPNet are powerful models for understanding regulatory genomics from @anshulkundaje.bsky.social's group, and now it's way easier to go from raw data to trained models and analysis + results in PyTorch.

Try it out with `pip install bpnet-lite`
June 18, 2025 at 9:48 AM
Reposted by Miquel Anglada-Girotto
I wrote a quick application note on Tomtom-lite, a Python implementation of the Tomtom algorithm for comparing PWMs against each other. This implementation can be 10-1000x faster and, as a Python function, can be integrated into your workflows more easily.

www.biorxiv.org/content/10.1...
Tomtom-lite: Accelerating Tomtom enables large-scale and real-time motif similarity scoring
Summary Pairwise sequence similarity is a core operation in genomic analysis, yet most attention has been given to sequences made up of discrete characters. With the growing prevalence of machine lear...
June 3, 2025 at 6:02 PM
Reposted by Miquel Anglada-Girotto
Polygenic scores (PGS) offer insights into a person’s inherited risk of disease.

GeneticScores.org is a new platform that enables secure, cloud-based calculation of polygenic scores to make genomic risk prediction more accessible.

www.ebi.ac.uk/about/news/u...

🖥️🧬
June 9, 2025 at 11:21 AM
Today I learned that artists study primitive art to understand how art was made outside the context of the art business.

This made me wonder how science would be done nowadays outside the context of journal publishing. Would we try to answer different questions?
June 7, 2025 at 8:53 PM
Leveraging evolution to make fitness estimation scale with model size again! Great experiencing the making of this one behind the scenes 🙌
May 26, 2025 at 8:31 PM
Reposted by Miquel Anglada-Girotto
polars-bio - fast, scalable and out-of-core operations on large genomic interval datasets www.biorxiv.org/content/10.1... 🧬🖥️🧪 github.com/biodatageeks...
March 26, 2025 at 9:37 AM
How can we leverage Perturb-seq screens to study splicing factor (SF) regulation systematically?

Here’s our approach: bsky.app/profile/bior...
Using single-cell perturbation screens to decode the regulatory architecture of splicing factor programs https://www.biorxiv.org/content/10.1101/2025.02.07.637061v1
February 21, 2025 at 10:38 PM
Hi all! Inspired by how easy ColabFold (@sokrypton.org) made protein structure prediction for me, I have started ColabRNA to facilitate making predictions with RNA-based models!

Currently, the following models are available:
- SpliceAI
- Pangolin
- SpliceTransformer
- Borzoi

Happy to get feedback!
GitHub - MiqG/ColabRNA: Making RNA-based models accessible to all.
February 15, 2025 at 10:31 PM
Reposted by Miquel Anglada-Girotto
The Genomic Code: the genome instantiates a generative model of the organism www.cell.com/trends/genet... - really delighted to see this in print in @cp-trendsgenetics.bsky.social! 😊
The Genomic Code: the genome instantiates a generative model of the organism
How does the genome encode the form of the organism? What is the nature of this genomic code? Inspired by recent work in machine learning and neuroscience, we propose that the genome encodes a generat...
February 11, 2025 at 11:46 AM
Reposted by Miquel Anglada-Girotto
Theo Wolf's terrific writeup of the groundbreaking paper introducing Kolmogorov-Arnold Networks.
Kolmogorov-Arnold Networks: the latest advance in Neural Networks, simply explained
The new type of network that is making waves in the ML world.
February 12, 2025 at 6:52 PM
Reposted by Miquel Anglada-Girotto
Cool paper using an LLM to discover a protein sequence code for subcellular localization 👏

www.science.org/doi/10.1126/...
Protein codes promote selective subcellular compartmentalization
Cells have evolved mechanisms to distribute ~10 billion protein molecules to subcellular compartments where diverse proteins involved in shared functions must assemble. Here, we demonstrate that prote...
February 9, 2025 at 12:46 AM
Reposted by Miquel Anglada-Girotto
I wanted to write briefly about a very pleasant experience we recently had coordinating and collaborating closely on competing publications with 2 other teams. 1/
January 24, 2025 at 7:36 PM
Reposted by Miquel Anglada-Girotto
Build common reference indexes with Nextflow @nf-co.re nf-core/references https://github.com/nf-core/references 🧬🖥️🧪
December 30, 2024 at 12:06 PM
Reposted by Miquel Anglada-Girotto
A universal tool for chromatin loop annotation in bulk and single-cell Hi-C data [new]
Analyzes 3D genome data with a U-shaped network and axial attention to identify loops and other structures. Uses pretraining for universal detection.
December 24, 2024 at 7:48 PM
Reposted by Miquel Anglada-Girotto
MethylQUEEN: A Methylation Encoded DNA Foundation Model [new]
Novel model, MethylQUEEN, learns methylation states from DNA using a transformer, inferring tissue origin, gene expression and key regulatory sites.
December 26, 2024 at 7:52 PM
Nice benchmark!
How suitable are clustering methods for functional annotation of proteins? [new]
PAAC clustering: agglo,k-means,GMM best; spectral poor. Some annot.
December 29, 2024 at 9:38 AM
Ciutadella, 12/2024
December 24, 2024 at 5:36 PM
Reposted by Miquel Anglada-Girotto
December 17, 2024 at 2:36 PM
My new favorite episode!
🔥Today's new episode of the Night Science Podcast is super cool: A hypothesis is a liability! We talk about the interplay between hypothesis-driven and exploratory research, and discuss the insights of previous guests of the podcast. I'm really curious to know what you'll think!
December 16, 2024 at 8:25 PM