Jim Shaw
jimshaw.bsky.social
Postdoc at Dana-Farber and Harvard Med with Heng Li (@lh3lh3.bsky.social). Prev: UBC / UofT.

I like thinking about biological sequence analysis and its applications to metagenomics / microbial genomics.

https://jim-shaw-bluenote.github.io
Reposted by Jim Shaw
New blog post with some thoughts on @nanoporetech.com and their recent announcement that the P2 Solo will be discontinued:
rrwick.github.io/2026/01/21/p...
P2 Solo announcement and the trade-offs of a more stable ONT
a blog for miscellaneous bioinformatics stuff
rrwick.github.io
January 21, 2026 at 3:38 AM
Reposted by Jim Shaw
We just released #anvio v9, "eunice" 🎉

This version represents over 2,000 changes in the codebase since v8, increasing the total number of programs in the anvi'o ecosystem to 176.

Read the release notes:

github.com/merenlab/anv...

Visit our up-to-date web page:

anvio.org
January 20, 2026 at 11:48 AM
Reposted by Jim Shaw
My time in @martinsteinegger.bsky.social's group is ending, but I’m staying in Korea to build a lab at Sungkyunkwan University School of Medicine. If you or someone you know is interested in molecular machine learning and open-source bioinformatics, please reach out. I am hiring!
mirdita.org
Mirdita Lab - Laboratory for Computational Biology & Molecular Machine Learning
Mirdita Lab builds scalable bioinformatics methods.
mirdita.org
January 20, 2026 at 11:07 AM
Reposted by Jim Shaw
I am looking for a postdoc to develop high-performance algorithms in computational genomics. Email or DM me if interested. For more information, see hlilab.github.io/vacancies. RTs appreciated!
HLi Lab - Vacancies
Openings
hlilab.github.io
January 14, 2026 at 3:44 PM
Reposted by Jim Shaw
Now published in Algorithms for Molecular Biology: link.springer.com/article/10.1.... Key message: a tiny CNN model with 7k parameters can capture main splice signals across vertebrates+insect and halves the minimap2 & miniprot junction error rate. I always use this new feature now.
Preprint on "Improving spliced alignment by modeling splice sites with deep learning". It describes minisplice for modeling splice signals. Minimap2 and miniprot now optionally use the predicted scores to improve spliced alignment.
arxiv.org/abs/2506.12986
January 6, 2026 at 11:02 PM
Reposted by Jim Shaw
🎉 New year, NEW PREPRINT!

Bacteria exhibit astonishing genetic diversity, but where do new genes come from?

My best friend Arya Kaul (/labmate in the @baym lab) investigates how advantageous deletions can spawn new genes - "deletion-born fusions." 🧵:
Novel genes arise from genomic deletions across the bacterial tree of life https://www.biorxiv.org/content/10.64898/2026.01.05.697752v1
January 6, 2026 at 4:09 PM
Reposted by Jim Shaw
Proud to announce SimPhyNI, a new tool for bacterial GWAS with higher precision and scalability than existing tools. Try it out and let us know what you think!!
January 5, 2026 at 2:55 PM
Reposted by Jim Shaw
Grateful to share our paper on gene-specific selective sweeps in human gut microbiomes, now out in Nature! It has been a joy to work with @rwolff.bsky.social, whose insights and hard work made this possible.
www.nature.com/articles/s41...
Gene-specific selective sweeps are pervasive across human gut microbiomes - Nature
Development and application of the integrated linkage disequilibrium score (iLDS) reveals both selective pressures impacting the human gut microbiome and the mechanisms by which gut bacteria adapt to ...
www.nature.com
December 17, 2025 at 6:53 PM
Reposted by Jim Shaw
The scikit-bio paper is online in Nature Methods! Many thanks to our collaborators, community contributors and reviewers! We couldn’t have done it without you. www.nature.com/articles/s41... #Bioinformatics #OpenSource
Scikit-bio: a fundamental Python library for biological omic data analysis - Nature Methods
www.nature.com
December 11, 2025 at 5:57 PM
Reposted by Jim Shaw
The GTDB website now has an ANI calculator based on skani that supports uploading of user genomes. Try it at gtdb.ecogenomic.org/tools/skani.

Find more information about @jimshaw.bsky.social's fantastic tool at www.nature.com/articles/s41....
GTDB - skani calculator
An interface to compute pairwise ANI of NCBI genomes using the GTDB taxonomy.
gtdb.ecogenomic.org
December 11, 2025 at 2:59 PM
Reposted by Jim Shaw
I’m recruiting a postdoc to work on algorithms for cancer genome reconstruction. We have access to a rich set of tumour samples sequenced across multiple technologies. If interested, feel free to DM. Please share.
December 11, 2025 at 3:04 AM
Reposted by Jim Shaw
One flowcell from @nanoporetech.com yielded 260 Gbp 🎉🚀🤯🟩
December 8, 2025 at 4:04 PM
Reposted by Jim Shaw
100%, both ONT and PacBio (although most of what we do is not marine / streamlined genome). We just published a specific study of soil metag short- vs long-read, and we see that, among other things, long-reads assemble regions too complex for short reads academic.oup.com/nargab/artic...
Comparison of short-read and long-read metagenome assemblies in a natural soil community highlights systematic bias in recovery of high-diversity populations
Abstract. Comparisons of long-read and short-read (meta)genome assemblies typically show that short-read sequence assemblies are less error-prone, but stru
academic.oup.com
December 8, 2025 at 4:25 PM
Reposted by Jim Shaw
Happy to share our new AMR resource which has phenotypic AMR (usually MIC data) collected from publications and databases. This is paired with assemblies and annotations

We're excited for users who might train new models, find phenotype/genotype mismatches, or any other use
Antimicrobial resistance (AMR) is a growing health threat, making infections harder to treat and complicating routine medical care.

EMBL-EBI’s new AMR portal brings together laboratory resistance data and bacterial genomes in one open platform.

#WAAW2025 #ActOnAMR

www.ebi.ac.uk/about/news/t...
🧬💻
A new gateway to global antimicrobial resistance data
New online portal connects bacterial genomes with experimental resistance data to support antimicrobial resistance research.
www.ebi.ac.uk
November 19, 2025 at 12:27 PM
Reposted by Jim Shaw
Preprint out! Check out our new long-read metagenomic SNP-caller, SNooPy 😀. Work with Chris Quince. Thread 🧵
👉 www.biorxiv.org/content/10.6...
December 4, 2025 at 1:18 PM
Reposted by Jim Shaw
Microflora Danica: What can you learn from collecting and sequencing 10,000+ samples from a single country? Check out our new paper in @nature.com to find out. Incredible work led by Caitlin Singleton, Thomas B. N. Jensen, and Mads Albertsen from @aau.dk. 🦠🧫🧬
www.nature.com/articles/s41...
The Microflora Danica atlas of Danish environmental microbiomes - Nature
Microflora Danica—an atlas of Danish environmental microbiomes—reveals that although human-disturbed habitats have high alpha diversity, species reoccur, revealing hidden homogeneity.
www.nature.com
December 3, 2025 at 8:50 PM
Reposted by Jim Shaw
579 high-quality human genomes from @humanpangenome.bsky.social, Arab Pangenome and individual papers (CHM13, CN1, KSA001, I002C, YAO and KOREF1). Sequences available in the AGC format (3.7GB) and FM-index in the ropebwt3 format (20.3GB). For details, see github.com/lh3/human-asm
GitHub - lh3/human-asm: A collection of high-quality human genomes
A collection of high-quality human genomes. Contribute to lh3/human-asm development by creating an account on GitHub.
github.com
December 3, 2025 at 3:44 AM
Reposted by Jim Shaw
Out after peer-review: www.science.org/doi/full/10....

Our bottom line stayed: never use leave-one-out cross-validation as it has inherent train-test leakage. Consider our Rebalanced version instead!

We now also account for regression and nested cross-validation, with more extensive benchmarking.
November 28, 2025 at 7:32 PM
Reposted by Jim Shaw
Long read Metagenomics, #phage and #prophage in the gut by Ami Bhatt's group. Beautiful data showing changes in phages over two years

#phagesky

www.nature.com/articles/s41...
Long-read metagenomics reveals phage dynamics in the human gut microbiome - Nature
Complex prophage integration dynamics, including low-level induction, cross-family host range and transposase-mediated mobilization, challenge existing paradigms and deepen our understanding of phage–...
www.nature.com
November 26, 2025 at 9:55 PM
Reposted by Jim Shaw
Comparative metagenomics using pan-metagenomics graphs https://www.biorxiv.org/content/10.1101/2025.11.24.690211v1
November 26, 2025 at 10:46 PM
Reposted by Jim Shaw
What is the best strategy to win any contest?

Eliminate your opponents of course.

Recently, my friend @fernpizza.bsky.social showed how plasmids compete intracellularly (check out his paper published in Science today!). With @baym.lol, we now know they can fight.

www.biorxiv.org/content/10.1...
November 20, 2025 at 10:12 PM
Reposted by Jim Shaw
Hot off the press! Our latest paper led by @fernpizza.bsky.social, understanding how plasmids evolve inside cells. These small, self-replicating DNA circles live inside bacteria and carry antibiotic resistance genes, but also compete with one another to replicate. 1/
www.science.org/doi/10.1126/...
Intracellular competition shapes plasmid population dynamics
From populations of multicellular organisms to selfish genetic elements, conflicts between levels of biological organization are central to evolution. Plasmids are extrachromosomal, self-replicating g...
www.science.org
November 20, 2025 at 9:42 PM
Reposted by Jim Shaw
“Bin Chicken” is now published in Nature Methods! It substantially improves genome recovery through rational coassembly 🧬🖥️. Applied to public 🌍 metagenomes, we recovered 24,000 novel species 🦠, including 6 new phyla.
doi.org/10.1038/s415...
@benjwoodcroft.bsky.social @rhysnewell.bsky.social
🧵1/6
November 13, 2025 at 10:09 AM