James Bonfield
@jbonfield.bsky.social
Walker, archer, and volunteer woodland warden by weekend, and bioinformatics software engineer and general geek by weekday.
My favourite prime is 15551, my favourite colour is, obviously, octarine, and I love nothing more than being immersed in nature.
It's curious to see grokipedia is up and has samtools, SAM and BAM pages all with a flurry of edits. Oddly, most of those edits are correcting minor edits in the Wikipedia pages it based its content on (presumably via some AI rewriting tool).
A shame the originals weren't edited too.
October 29, 2025 at 7:54 PM
Reposted by James Bonfield
The Metagraph paper is out in Nature; it showed up in my feeds today! Congratulations to Mikhail Karasikov, @gxxxr.bsky.social, @akkah21.bsky.social and all of the other authors (whom I'd love to follow on Bluesky if I can find you ;P) www.nature.com/articles/s41...
Efficient and accurate search in petabase-scale sequence repositories - Nature
MetaGraph enables scalable indexing of large sets of DNA, RNA or protein sequences using annotated de Bruijn graphs.
www.nature.com
October 9, 2025 at 2:40 PM
Reposted by James Bonfield
"OpenZL is our answer to the tension between the performance of format-specific compressors and the maintenance simplicity of a single executable binary."
engineering.fb.com/2025/10/06/d...
October 6, 2025 at 8:58 PM
Reposted by James Bonfield
Delighted to finally announce a preprint describing the Q100 project! “A complete diploid human genome benchmark for personalized genomics” For which we finished HG002 to near-perfect accuracy: www.biorxiv.org/content/10.1... 🧵[1/14]
A complete diploid human genome benchmark for personalized genomics
Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and ...
www.biorxiv.org
September 22, 2025 at 5:01 PM
Note: OLD POST! (2023), but I just noticed it.
While it's nice to see comparisons, why compare an (at the time) 2 year old GATK against a 5 year old bcftools?
Since then both have come on a lot. It'd be interesting to see new independent comparisons. (Neither can hold up to deepvariant now.)
Important comparison of Bcftools and GATK in simulated Drosophila genomes: "by benchmark analyses with a simulated insect population...Bcftools mpileup performs better than GATK HaplotypeCaller in terms of recovery rate and accuracy regardless of mapping software."
The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species...
Scientific Reports - The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species
www.nature.com
September 18, 2025 at 8:58 PM
Reposted by James Bonfield
I'm sorry, worldwide, irrevocable, non-exclusive, transferable permission to my voice and likeness? For what now? In any manner for any purpose???
This is in academia/.edu's new ToS, which you're prompted to agree to on login. Anyway I'll be jumping ship. You can find my stuff at hcommons.org.
September 17, 2025 at 5:16 PM
Reposted by James Bonfield
minimap2.com is potentially a phishing site. Please don't use anything from that website.
github.com/lh3/minimap2...
Phishing site : minimap2.com · Issue #1316 · lh3/minimap2
Not sure how to label this one, but I have come across a website minimap2.com which appears to be AI generated but is serving its own copy of the Github repository. If you search the address or em...
github.com
September 9, 2025 at 3:40 PM
Heads up: ignore samtools dot org, similarly minimap2 dot com and likely others. It's owned by a known phishing site and while the binaries they offer look valid currently (but note they may be serving us different binaries to others), that could change.
Ie: it's not us (Samtools team)! Be warned
September 15, 2025 at 8:40 AM
Reposted by James Bonfield
Nigel Farage looks uncomfortable as Jamie Raskin uses his opening statement to absolutely demolish him
September 3, 2025 at 4:39 PM
Reposted by James Bonfield
Preprint alert!
We present K2Rmini, an ultra-fast, grep-like tool that extracts sequences of interest from FASTA/FASTQ files based on their k-mer content.
www.biorxiv.org/content/10.1...
A thread
Accelerating k-mer-based sequence filtering
The exponential growth of global sequencing data repositories presents both analytical challenges and opportunities. While k-mer-based indexing has improved scalability over traditional alignment fo...
www.biorxiv.org
July 2, 2025 at 1:00 PM
Reposted by James Bonfield
Release 1.22 of HTSlib, SAMtools, and BCFtools is now available from GitHub. See htslib.org/download/ for links to tarballs and release notes. 🧪
#samtools #bcftools #htslib #bioinformatics
Samtools
htslib.org
May 30, 2025 at 10:22 AM
Reposted by James Bonfield
📢 HPRC Release 2 is here!
Now with phased genomes from 200+ individuals, a 5x increase from Release 1.
Explore sequencing data, assemblies, annotations & alignments in our interactive data explorer ⬇️:
humanpangenome.org/hprc-data-re...
May 12, 2025 at 1:15 PM
The magnificent seven: Oh deer!
Fallow deer, Potton Wood, Beds.
Small herds like this are fine and magical to behold, but like everything in nature things need to be kept in balance. We killed the wolves, hence deer stalking has its place as the man-made alternative to the wild balance.
May 5, 2025 at 8:51 AM
Some of the lovely blue #wildflowers in the garden currently. #nature
May 3, 2025 at 6:26 PM
Marvelling at a wonderful patch of Herb Paris (Paris quadrifolia) in a #wildlifebcn West Cambs ancient woodland. Plus my first (Midland) Hawthorn of the year in flower, and some lovely Bugle.
#wildflowerhour, even though none of them are tiny plants.
April 20, 2025 at 8:32 PM
Delightful Pasque Flowers on this evening's walk. A brief stop-over at Therfield Heath on the way home from work. #wildflowers #easter #nature
April 14, 2025 at 10:42 PM
A beautiful morning for walking around #gamlingaywood today. Nice to see the ash stool has regained the Wood Anemone crown again. The Herb Paris patch is looking splendid too.
April 11, 2025 at 9:42 PM
My first fresh looking Large White butterfly of the year, fluttering around the greenhouse until I caught it.
April 10, 2025 at 2:00 PM
Reposted by James Bonfield
Going to recode the C. elegans genome to make a direworm
April 8, 2025 at 2:06 PM
Reposted by James Bonfield
Public service announcement. They are not Dire Wolves. They have 20 single letter changes in their entire genomes. I’ve done shits with more mutations.
Every time journalists write up a Colossus press release, they are making people stupider. Client journalism by a ridiculous company.
April 7, 2025 at 8:02 PM
An early trendsetter - the English Bluebell, in Gamlingay Wood. I only saw 2 of them out of hundreds of thousands (or more). Signs of things to come.
Lots of wood anemones, lesser celandines, primroses, oxlips and wild strawberries too.
#wildflower #sssi #uknature
March 28, 2025 at 9:19 PM
A reminder to SAMtools / HTSlib users: from the next release (probably April 2025), CRAM 3.1 will be the default version used. Use "samtools view -O cram,version=3.0" to switch back to the current 3.0.
(But note 3.1 should be smaller and faster, especially on Illumina.)
March 26, 2025 at 11:14 PM
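For anyone scripting around the CRAM default change above, here is a rough sketch of pinning the version explicitly. It only builds the command line and does not run samtools. The helper name and the file names (in.bam, out.cram, ref.fa) are made up for illustration; the "-O cram,version=3.0" option is the one quoted in the post, and "-T" and "-o" are the usual samtools view reference and output options.

```python
import shlex


def cram_view_command(src, dst, reference=None, cram_version="3.0"):
    """Build (but do not run) a samtools view command that pins the CRAM version.

    Hypothetical helper for illustration only; src, dst and reference are
    placeholder file names supplied by the caller.
    """
    cmd = ["samtools", "view", "-O", f"cram,version={cram_version}"]
    if reference is not None:
        # -T names the reference FASTA used for CRAM encoding.
        cmd += ["-T", reference]
    cmd += ["-o", dst, src]
    return cmd


# Print a copy-pasteable command line for a made-up input file.
print(shlex.join(cram_view_command("in.bam", "out.cram", reference="ref.fa")))
```

Passing cram_version="3.1" instead gives the same command with the newer version pinned, which may be handy for comparing file sizes before the default changes.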
Reposted by James Bonfield
Blimey. Looks like pretty much ALL of my books have been pirated by LibGen - the place that’s been scraped by generative AI developers. (thanks @gregjenner.bsky.social):
www.theatlantic.com/technology/a...
March 22, 2025 at 9:41 AM