James Bonfield
jbonfield.bsky.social
Walker, archer, and volunteer woodland warden by weekend, and bioinformatics software engineer and general geek by weekday.

My favourite prime is 15551, my favourite colour is, obviously, octarine, and I love nothing more than being immersed in nature.
It's curious to see that Grokipedia is up and has samtools, SAM and BAM pages, all with a flurry of edits. Oddly, most of those edits correct minor errors in the Wikipedia pages it based its content on (presumably via some AI rewriting tool).

A shame the originals weren't edited too.
October 29, 2025 at 7:54 PM
Reposted by James Bonfield
The Metagraph paper is out in Nature; it showed up in my feeds today! Congratulations to Mikhail Karasikov, @gxxxr.bsky.social, @akkah21.bsky.social and all of the other authors (whom I'd love to follow on Bluesky if I can find you ;P) www.nature.com/articles/s41...
Efficient and accurate search in petabase-scale sequence repositories - Nature
MetaGraph enables scalable indexing of large sets of DNA, RNA or protein sequences using annotated de Bruijn graphs.
www.nature.com
October 9, 2025 at 2:40 PM
Reposted by James Bonfield
"OpenZL is our answer to the tension between the performance of format-specific compressors and the maintenance simplicity of a single executable binary."
engineering.fb.com/2025/10/06/d...
October 6, 2025 at 8:58 PM
Reposted by James Bonfield
Delighted to finally announce a preprint describing the Q100 project! “A complete diploid human genome benchmark for personalized genomics” For which we finished HG002 to near-perfect accuracy: www.biorxiv.org/content/10.1... 🧵[1/14]
A complete diploid human genome benchmark for personalized genomics
Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and ...
www.biorxiv.org
September 22, 2025 at 5:01 PM
Note: OLD POST! (2023), but I just noticed it.

While it's nice to see comparisons, why compare an (at the time) 2-year-old GATK against a 5-year-old bcftools?

Since then both have come on a lot. It'd be interesting to see new independent comparisons. (Neither can hold up to DeepVariant now.)
Important comparison of Bcftools and GATK in simulated Drosophila genomes: "by benchmark analyses with a simulated insect population...Bcftools mpileup performs better than GATK HaplotypeCaller in terms of recovery rate and accuracy regardless of mapping software."
The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species...
Scientific Reports - The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species
www.nature.com
September 18, 2025 at 8:58 PM
Reposted by James Bonfield
I'm sorry, worldwide, irrevocable, non-exclusive, transferable permission to my voice and likeness? For what now? In any manner for any purpose???

This is in academia/.edu's new ToS, which you're prompted to agree to on login. Anyway I'll be jumping ship. You can find my stuff at hcommons.org.
September 17, 2025 at 5:16 PM
Reposted by James Bonfield
minimap2.com is potentially a phishing site. Please don't use anything from that website.
github.com/lh3/minimap2...
Phishing site : minimap2.com · Issue #1316 · lh3/minimap2
Not sure how to label this one, but I have come across a website minimap2.com which appears to be AI generated but is serving its own copy of the Github repository. If you search the address or em...
github.com
September 9, 2025 at 3:40 PM
Heads up: ignore samtools dot org, and similarly minimap2 dot com and likely others. It's owned by a known phishing site, and while the binaries they offer currently look valid (though note they may be serving different binaries to us than to others), that could change.

I.e. it's not us (Samtools team)! Be warned.
September 15, 2025 at 8:40 AM
Reposted by James Bonfield
Nigel Farage looks uncomfortable as Jamie Raskin uses his opening statement to absolutely demolish him
September 3, 2025 at 4:39 PM
Reposted by James Bonfield
Preprint alert!
We present K2Rmini, an ultra-fast, grep-like tool that extracts sequences of interest from FASTA/FASTQ files based on their k-mer content.
www.biorxiv.org/content/10.1...
A thread
Accelerating k-mer-based sequence filtering
The exponential growth of global sequencing data repositories presents both analytical challenges and opportunities. While k-mer-based indexing has improved scalability over traditional alignment fo...
www.biorxiv.org
July 2, 2025 at 1:00 PM
Are juvenile common lizards normally black, or is this a melanistic one? It was tiny. Seen at #RSPB #Fowlmere.
June 27, 2025 at 6:10 PM
Reposted by James Bonfield
Release 1.22 of HTSlib, SAMtools, and BCFtools is now available from GitHub. See htslib.org/download/ for links to tarballs and release notes. 🧪

#samtools #bcftools #htslib #bioinformatics
Samtools
htslib.org
May 30, 2025 at 10:22 AM
Reposted by James Bonfield
📢 HPRC Release 2 is here!

Now with phased genomes from 200+ individuals, a 5x increase from Release 1.

Explore sequencing data, assemblies, annotations & alignments in our interactive data explorer ⬇️:

humanpangenome.org/hprc-data-re...
May 12, 2025 at 1:15 PM
The magnificent seven: Oh deer!
Fallow deer, Potton Wood, Beds.

Small herds like this are fine and magical to behold, but like everything in nature things need to be kept in balance. We killed the wolves, hence deer stalking has its place as the man-made alternative to the wild balance.
May 5, 2025 at 8:51 AM
Some of the lovely blue #wildflowers in the garden currently. #nature
May 3, 2025 at 6:26 PM
Marvelling at a wonderful patch of Herb Paris (Paris quadrifolia) in a #wildlifebcn West Cambs ancient woodland. Plus my first (Midland) Hawthorn of the year in flower, and some lovely Bugle.

#wildflowerhour, even though none of them are tiny plants.
April 20, 2025 at 8:32 PM
Delightful Pasque Flowers on this evening's walk. A brief stop-over at Therfield Heath on the way home from work. #wildflowers #easter #nature
April 14, 2025 at 10:42 PM
A beautiful morning for walking around #gamlingaywood today. Nice to see the ash stool has regained the Wood Anemone crown again. The Herb Paris patch is looking splendid too.
April 11, 2025 at 9:42 PM
My first fresh looking Large White butterfly of the year, fluttering around the greenhouse until I caught it.
April 10, 2025 at 2:00 PM
Reposted by James Bonfield
Going to recode the C. elegans genome to make a direworm
April 8, 2025 at 2:06 PM
Reposted by James Bonfield
Public service announcement. They are not Dire Wolves. They have 20 single letter changes in their entire genomes. I’ve done shits with more mutations.

Every time journalists write up a Colossal press release, they are making people stupider. Client journalism by a ridiculous company.
April 7, 2025 at 8:02 PM
An early trendsetter - the English Bluebell, in Gamlingay Wood. I only saw 2 of them out of hundreds of thousands (or more). Signs of things to come.

Lots of wood anemones, lesser celandines, primroses, oxlips and wild strawberries too.

#wildflower #sssi #uknature
March 28, 2025 at 9:19 PM
A reminder to SAMtools / HTSlib users: from the next release (probably April 2025), CRAM 3.1 will be the default version used. Use "samtools view -O cram,version=3.0" to switch back to the current 3.0.

(But note 3.1 should be smaller and faster, especially on Illumina data.)
March 26, 2025 at 11:14 PM
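For anyone wanting to pin the CRAM version as the reminder above describes, here is a minimal sketch of the full command line. Only the `-O cram,version=3.0` part comes from the post. The helper name `cram30_cmd` and the file names `ref.fa`, `in.bam` and `out.cram` are hypothetical placeholders, and `-T` (reference FASTA) and `-o` (output file) are the usual `samtools view` options assumed here. The function only prints the command, so it can be inspected before running.

```shell
#!/bin/sh
# Print a samtools command that writes CRAM 3.0 explicitly rather than
# relying on the release default.
# Arguments (all hypothetical placeholder names):
#   $1 = reference FASTA, $2 = input BAM, $3 = output CRAM
cram30_cmd() {
    printf 'samtools view -O cram,version=3.0 -T %s -o %s %s\n' "$1" "$3" "$2"
}

# Show the command for the placeholder files.
cram30_cmd ref.fa in.bam out.cram
```

Copy the printed line into a shell to run it. Afterwards, HTSlib's `htsfile out.cram` should report which CRAM version was written.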
Reposted by James Bonfield
Blimey. Looks like pretty much ALL of my books have been pirated by LibGen - the place that’s been scraped by generative AI developers. (thanks @gregjenner.bsky.social):
www.theatlantic.com/technology/a...
March 22, 2025 at 9:41 AM