Milot Mirdita
milot.bsky.social
Open source #bioinformatics at Sungkyunkwan University 🇰🇷 | former Steinegger Lab @ SNU, Söding Lab @ MPI-NAT | http://mstdn.science/@milotmirdita
Reposted by Milot Mirdita
Introducing The Structural History of Eukarya (SHE): The first proteome-scale phylogeny constructed entirely from 3D structure.
We computed 300 trillion alignments across 1,542 species to map the tree of life. 🧵👇 (1/5)
February 7, 2026 at 8:50 AM
Reposted by Milot Mirdita
Please spread the word:

We invite applications to a two-week Computational Biology workshop in Singapore, June 14-27.

This NSF-funded workshop brings together 16-20 US grad students with international peers.
Apply by March 21: compbioasia.net
🧵 Details below:
Compbio Asia
February 5, 2026 at 5:22 PM
Reposted by Milot Mirdita
Distance-Restraint-Guided Diffusion Models for Sampling Protein Conformational Changes and Ligand Dissociation Pathways
Tatsuki Hori, Yoshitaka Moriwaki, Ryuichiro Ishitani
www.biorxiv.org/content/10.6...
Our new preprint is out.
February 2, 2026 at 7:52 AM
Reposted by Milot Mirdita
FoldMason is out now in @science.org. It generates accurate multiple structure alignments for thousands of protein structures in seconds. Great work by Cameron L. M. Gilchrist and @milot.bsky.social.
📄 www.science.org/doi/10.1126/...
🌐 search.foldseek.com/foldmason
💾 github.com/steineggerla...
Multiple protein structure alignment at scale with FoldMason
Protein structure is conserved beyond sequence, making multiple structural alignment (MSTA) essential for analyzing distantly related proteins. Computational prediction methods have vastly extended ou...
January 30, 2026 at 6:11 AM
Reposted by Milot Mirdita
Can ever-increasing sequence databases improve phylogenetic reconstruction of a gene family? Our new preprint introduces AmpliPhy, a pipeline that automates homolog enrichment to improve gene tree inference, built on a robust phylogenomic benchmark scheme. 🧵1/n
📃 doi.org/10.64898/2026.01.26.701724
AmpliPhy improves gene trees by adding homologs without affecting alignments
In phylogenomics, gene tree reconstruction depends on multiple sequence alignment (MSA) and tree inference, and ongoing work continues to improve inference quality. Denser taxon sampling has been associated with improved gene tree inference, suggesting that adding homologs could be a practical route to higher accuracy as sequence databases continue to expand. However, adding sequences can influence multiple steps of typical inference pipelines, and little is known on its specific effect on the multiple sequence alignment, tree reconstruction, and rooting steps. We performed a large-scale empirical benchmark to quantify how homolog enrichment affects alignment and phylogenetic inference. Using an enrichment-impoverishment design and a measure of tree accuracy based on taxonomic congruence, we found that enrichment consistently improves tree inference quality, while effects on alignment quality are marginal. We show that this improvement is associated with accurate root placement on enriched trees when sensitive homolog search is accompanied. Notably, much of the benefit can be retained with relatively compact alignments produced by sequence addition. Building on these observations, we provide a tool, AmpliPhy, which efficiently improves phylogenetic reconstruction of protein families through homolog enrichment. The AmpliPhy open-source pipeline software is available at https://github.com/DessimozLab/ampliphy.
January 28, 2026 at 6:10 AM
Reposted by Milot Mirdita
Milot’s venture into establishing his own lab is incredibly exciting. I highly recommend joining Milot on his mission to advance molecular biology through open-source bioinformatics.
My time in @martinsteinegger.bsky.social's group is ending, but I’m staying in Korea to build a lab at Sungkyunkwan University School of Medicine. If you or someone you know is interested in molecular machine learning and open-source bioinformatics, please reach out. I am hiring!
mirdita.org
Mirdita Lab - Laboratory for Computational Biology & Molecular Machine Learning
Mirdita Lab builds scalable bioinformatics methods.
January 21, 2026 at 3:37 AM
My time in @martinsteinegger.bsky.social's group is ending, but I’m staying in Korea to build a lab at Sungkyunkwan University School of Medicine. If you or someone you know is interested in molecular machine learning and open-source bioinformatics, please reach out. I am hiring!
mirdita.org
Mirdita Lab - Laboratory for Computational Biology & Molecular Machine Learning
Mirdita Lab builds scalable bioinformatics methods.
January 20, 2026 at 11:07 AM
Reposted by Milot Mirdita
This is very sad news

'It is with great sadness that EMBL announces that Interim Director General Professor Peer Bork passed away from natural causes on 16 January 2026.'

www.embl.org/news/embl-an...
In remembrance of Peer Bork | EMBL
EMBL and its community are deeply saddened by the death of Peer Bork, the organisation’s Interim Director General.
January 16, 2026 at 6:06 PM
Reposted by Milot Mirdita
Happy to share that our work on HLp, a bacterial histone from Leptospira perolatii, is now published in Nature Communications 🎉

In this study, we show that HLp forms stable tetramers that wrap ~60 bp of DNA, revealing a distinct histone–DNA organization in bacteria.

www.nature.com/articles/s41...
December 13, 2025 at 8:09 AM
Reposted by Milot Mirdita
From Sameer Velankar & colleagues in @narjournal.bsky.social #NARDatabaseIssue | PDBe: enhanced structural data exploration to facilitate discovery | #Bioinformatics #Database #OpenScience #Proteomics #PDB 🧬 🖥️🧪🔓
⬇️
academic.oup.com/nar/advance-...
PDBe: enhanced structural data exploration to facilitate discovery
Abstract. Protein Data Bank in Europe (PDBe) is a founding member of the worldwide Protein Data Bank (wwPDB), delivering open access to experimentally dete
December 11, 2025 at 3:13 PM
Reposted by Milot Mirdita
Today marks one year since the Dec. 3, 2024 martial law declaration that rocked South Korea and still reverberates today. What’s on my mind today is the grit of South Koreans who rushed to the National Assembly that night, in freezing weather, to demand a return to democratic government.
December 3, 2025 at 3:09 AM
Reposted by Milot Mirdita
We are deeply saddened to learn of the passing of Amos Bairoch. His vision and leadership helped build the foundations of today’s bioinformatics community. From the creation of essential biological databases to decades of mentorship, his influence can be felt across research groups worldwide.
December 2, 2025 at 5:00 PM
Reposted by Milot Mirdita
LoL-align: sensitive and fast probabilistic protein structure alignment https://www.biorxiv.org/content/10.1101/2025.11.24.690091v1
November 26, 2025 at 2:46 AM
Reposted by Milot Mirdita
A few py2Dmol updates 🧬

py2dmol.solab.org
Integration with AlphaFoldDB (will auto-fetch results). Drag and drop results from AF3-server or ColabFold for an interactive experience! (1/4)
November 19, 2025 at 8:15 AM
Reposted by Milot Mirdita
Guess the news is officially out! Extremely excited to announce that I will be starting my own laboratory at Institut Pasteur @pasteur.fr this coming spring!

Slight change to my office window view from Tokyo Tower🗼 to the Tour Eiffel. 🇫🇷
November 15, 2025 at 6:42 AM
Reposted by Milot Mirdita
I want to spell this out in case the implications aren't clear:

This means all public tools/webapps of GISAID data (all the ones you've been used to seeing thru the pandemic, as far as we can tell) are prohibited.

The file allowed this. Cut that - cut off all tools the public & others were using.
On Oct 1, 2025, GISAID informed us that they had ended updates to the flat file of SARS-CoV-2 genomic sequences and associated metadata that we had used to update Nextstrain analyses since Feb 2020. GISAID's stated rationale was that their "resources are limited". 1/5
November 7, 2025 at 2:41 PM
Reposted by Milot Mirdita
OpenFold3-preview (OF3p) is out: a sneak peek of our AF3-based structure prediction model. Our aim for OF3 is full AF3-parity for every modality. We now believe we have a clear path towards this goal and are releasing OF3p to enable building in the OF3 ecosystem. More👇
October 28, 2025 at 6:30 PM
Reposted by Milot Mirdita
DIAMOND v2.1.15 now supports all taxonomy features for BLAST databases, and support for using BLAST databases has also been added to the Bioconda version github.com/bbuchfink/di...
GitHub - bbuchfink/diamond: Accelerated BLAST compatible local sequence aligner.
October 28, 2025 at 4:45 PM
Reposted by Milot Mirdita
Our new preprint is out. Our group performed a comprehensive protein–protein complex prediction within 2,437 biosynthetic gene clusters. We predicted a total of 487,828 complexes for known BGCs, identifying 15,438 heteromeric interactions with an ipTM ≥ 0.6. (2/3)
www.biorxiv.org/content/10.1...
Predicting protein complexes in biosynthetic gene clusters
Biosynthetic gene clusters (BGCs) are contiguous genomic regions that encode diverse, non-homologous proteins required for the production of specific natural products. Their genetic diversity underlie...
October 28, 2025 at 5:58 AM
Reposted by Milot Mirdita
Working on the protein-hunter-chai google colab notebook. 😈

@yehlincho.bsky.social
October 28, 2025 at 3:34 AM
Reposted by Milot Mirdita
Excited to release BoltzGen which brings SOTA folding performance to binder design! The best part of this project is collaborating with a broad network of leading wetlabs that test BoltzGen at an unprecedented scale, showing success on many novel targets and pushing the model to its limits!
October 26, 2025 at 10:40 PM
Reposted by Milot Mirdita
We train machine learning models on millions of proteins. But when it comes to making predictions, do we need them to understand all proteins at once? Often, we need an accurate model for the specific protein we are studying or designing. We address this with ProteinTTT arxiv.org/abs/2411.02109 1/🧵
October 23, 2025 at 1:08 PM
Reposted by Milot Mirdita
End-to-end protein design in the browser through evedesign. Generate and interactively explore designs in 2D/3D and export them as codon-optimized DNA. The underlying open source framework (released soon) is built to easily add new methods, more on that soon.
🌐 evedesign.bio
October 22, 2025 at 2:30 PM