Dongwook Kim
@dongwookkim.bsky.social
Developing fast and easy methods for #phylogenetics and #bioinformatics | PhD in Bioinformatics | Postdoc @ Comparative Genomics Lab, UNIL/SIB🇨🇭| Formerly @ Steinegger Lab, SNU🇰🇷 | he/him
Pinned
Dongwook Kim
@dongwookkim.bsky.social
· Jun 3
Unicore is now published in GBE 🚀
Unicore rapidly identifies structural single-copy core genes from input species proteomes for phylogenetic analysis. Powered by Foldseek and ProstT5, Unicore enables linear-scale structure-based phylogeny of any given set of taxa. 🧵1/n
📃 doi.org/10.1093/gbe/evaf109
Reposted by Dongwook Kim
Stoked to finally have a preprint out for Phold, our tool that uses protein structural information to enhance phage genome annotation #phagesky 1/n
www.biorxiv.org/content/10.1...
Protein Structure Informed Bacteriophage Genome Annotation with Phold
Bacteriophage (phage) genome annotation is essential for understanding their functional potential and suitability for use as therapeutic agents. Here we introduce Phold, an annotation framework utilis...
www.biorxiv.org
August 8, 2025 at 7:11 AM
Reposted by Dongwook Kim
Our new preprint is out!
www.biorxiv.org/content/10.1...
In this study, we present the largest systematic analysis of microbiome structure and function, integrating 85K uniformly processed metagenomes from diverse habitats worldwide.
@podlesny.bsky.social @jonas-bio.bsky.social @borklab.bsky.social
Planetary microbiome structure and generalist-driven gene flow across disparate habitats
Microbes are ubiquitous on Earth, forming microbiomes that sustain macroscopic life and biogeochemical cycles. Microbial dispersion, driven by natural processes and human activities, interconnects mic...
www.biorxiv.org
July 21, 2025 at 11:56 AM
Reposted by Dongwook Kim
OrthoFinder just dropped a major update
It’s faster, more accurate, and ready for thousands of genomes
Let’s break it down (1/10)
github.com/OrthoFinder/...
www.biorxiv.org/content/10.1...
July 16, 2025 at 5:51 PM
Reposted by Dongwook Kim
Folddisco finds similar (dis)continuous 3D motifs in large protein structure databases. Its efficient index enables fast uncharacterized active site annotation, protein conformational state analysis and PPI interface comparison. 1/9🧶🧬
📄 www.biorxiv.org/content/10.1...
🌐 search.foldseek.com/folddisco
July 7, 2025 at 8:21 AM
Reposted by Dongwook Kim
New paper from Sriram Garg in my group. We introduce a general substitution matrix for structural phylogenetics. I think this is a big deal, so read on below if you think deep history is important. academic.oup.com/mbe/advance-...
A general substitution matrix for structural phylogenetics.
Abstract. Sequence-based maximum likelihood (ML) phylogenetics is a widely used method for inferring evolutionary relationships, which has illuminated the
academic.oup.com
June 11, 2025 at 2:01 PM
Unicore is now published in GBE 🚀
Unicore rapidly identifies structural single-copy core genes from input species proteomes for phylogenetic analysis. Powered by Foldseek and ProstT5, Unicore enables linear-scale structure-based phylogeny of any given set of taxa. 🧵1/n
📃 doi.org/10.1093/gbe/evaf109
June 3, 2025 at 6:55 AM
Reposted by Dongwook Kim
AFESM: a metagenomic guide through the protein structure universe! We clustered 821M structures (AFDB&ESMatlas) into 5.12M groups; revealing biome-specific groups, only 1 new fold even after AlphaFold2 re-prediction & many novel domain combos. 🧵
🌐 afesm.foldseek.com
📄 www.biorxiv.org/content/10.1...
April 27, 2025 at 12:13 AM
Reposted by Dongwook Kim
Visit our posters at #RECOMB2025 for:
Structural: MSAs, Virus DB, Core Genes, Motif Discovery, Multimer Clustering & Search, pLM Foldseek, Environmental analysis
Metagenomics: Classification & Metabuli App
GPU-based & RNA search, Proteome clustering, Novel Ribozyme discovery
& get Marv stickers!
April 25, 2025 at 7:46 AM
Reposted by Dongwook Kim
Not really my announcement to make--I am but a lesser co-author--but IQ-TREE 3 has just been released!
(Most credit to Minh Bui and @roblanfear.bsky.social and their labs)
ecoevorxiv.org/repository/v...
IQ-TREE 3: Phylogenomic Inference Software using Complex Evolutionary Models
ecoevorxiv.org
April 10, 2025 at 2:13 PM
Reposted by Dongwook Kim
🚀 #AlphaFold Database update
AlphaFold DB now integrates The Encyclopedia of Domains (TED) – a resource designed to systematically identify & classify structural domains within AlphaFold-predicted protein structures.
www.ebi.ac.uk/about/news/u...
@pdbeurope.bsky.social
March 3, 2025 at 4:33 PM
Reposted by Dongwook Kim
The PAN-GO paper is a remarkable milestone. It not only provides the most comprehensive picture of human gene function to date, but also carefully maps this knowledge across the tree of life! Congratulations @marcfeuermann.bsky.social, Pascale Gaudet & collaborators!
www.sib.swiss/news/sib-hel...
February 26, 2025 at 10:37 PM
Reposted by Dongwook Kim
In our latest review, we explore 12 deep-learning tools for metagenomic analysis, covering their strengths, limitations, and key applications. We hope it serves as both a resource and inspiration for new ways to analyze metagenomic data. Great work by Eli Levy Karin!
📄 doi.org/10.1093/nsr/...
February 22, 2025 at 5:47 AM
Reposted by Dongwook Kim
FastOMA is out now in Nature Methods 🎉: nature.com/articles/s41592-024-02552-8 A new orthology inference algorithm that scales linearly and is highly accurate. FastOMA can process all >2000 eukaryotic UniProt ref proteomes in <24 hours 🚀. Try it out: github.com/DessimozLab/fastoma @dessimoz.bsky.social
January 3, 2025 at 2:14 PM
Reposted by Dongwook Kim
Unicore identifies single-copy protein structures across genomes using Foldseek, bypassing slow structure predictions by utilizing 3Di predictions from ProstT5, enabling rapid phylogenetic inference at the tree-of-life scale. 1/n
📄 www.biorxiv.org/content/10.1...
💾 github.com/steineggerla...
December 23, 2024 at 4:39 PM
Reposted by Dongwook Kim
Unicore enables scalable and accurate phylogenetic reconstruction with structural core genes https://www.biorxiv.org/content/10.1101/2024.12.22.629535v1
December 23, 2024 at 3:51 AM
Reposted by Dongwook Kim
Scientists, academics, researchers: We’re excited to share that @altmetric.com is now tracking mentions of your research on Bluesky! 🧪
There are already many articles for which there is more attention on Bluesky than on other comparable micro-blogging sites, meaning the academic community and the general public have clearly adopted Bluesky as one of their core places to disseminate and discuss new research.
A Place of Joy.
December 3, 2024 at 2:10 PM
Reposted by Dongwook Kim
South Korean citizens helped lawmakers scale the National Assembly walls so they could bypass military barricades and vote against martial law.
December 3, 2024 at 5:15 PM
Reposted by Dongwook Kim
Reminder for newcomers that bioRxiv has Bluesky accounts in every subject category - great way to keep up (please re-skeet) connect.biorxiv.org/news/2023/09...
bioRxiv expands on Mastodon and Bluesky
bioRxiv - the preprint server for biology, operated by Cold Spring Harbor Laboratory, a research and educational institution
connect.biorxiv.org
November 10, 2024 at 1:40 PM
Reposted by Dongwook Kim
Interested in bioinformatics method development for proteins, structures or metagenomic analysis? Please check out my lab’s starter pack!
🔗 go.bsky.app/VJhXcSs
November 28, 2024 at 12:36 PM
Reposted by Dongwook Kim
MMseqs2 Release 16 Highlights: GPU-accelerated search📄, ORF or new 6-frame translated search modes, contig taxonomy always keeps the longest ORF, bug fixes (reduced memory and higher sensitivity) and relicensed as MIT
📄 biorxiv.org/content/10.1...
💾 mmseqs.com and 🐍Bioconda 🖥️🧬🧶
November 27, 2024 at 9:08 AM
Reposted by Dongwook Kim
What did the Last Eukaryotic Common Ancestor (#LECA) look like? Consensus View in #PLOSBiology; massive authorship including @AncestralState, @lauraeme.bsky.social, John Archibald, @andrewjroger.bsky.social, @dackslabecb.bsky.social, Jeremy Wideman. plos.io/4g0alq4
November 25, 2024 at 7:29 PM
Reposted by Dongwook Kim
Our Big Fantastic Virus Database (BFVD) is now published in NAR! It contains protein structure predictions of major viral clades, enhanced by petabase-scale homology search, and it's explorable on the web.
🌐 bfvd.foldseek.com
💾 bfvd.steineggerlab.workers.dev
📄 academic.oup.com/nar/advance-...
November 23, 2024 at 9:12 PM