#deepvariant
Not my specialty, but DeepVariant and DeepConsensus from Google might be interesting

I don’t know how much detail is in the open about ONT’s Bonito basecaller
August 19, 2023 at 2:25 AM
Ok so this #deepvariant caller update from #google is BIG.
github.com/google/deepv...

- ~1.7x speedup (40% runtime reduction)
- Pangenome integration
- New model and improvements over the previous model

Excited to try this new update 🧬💻

#bioinformatics #wgs #ngs #variantcalling
Release DeepVariant 1.8.0 · google/deepvariant
In this release: Small model integration: Speed increased by ~1.7x (40% runtime reduction) for WGS, PacBio, and ONT by introduction of additional small model. The small model identifies easy-to-...
github.com
December 9, 2024 at 6:39 AM
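The two figures in the release note ("~1.7x" and "40% runtime reduction") describe the same change from two angles, which is easy to mix up. A minimal sketch of the conversion, assuming the speedup is simply old runtime divided by new runtime; the function names are made up for illustration and are not part of DeepVariant:

```python
def speedup_from_runtime_reduction(reduction: float) -> float:
    """Convert a fractional runtime reduction (0.40 = 40% less time) into a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    # New runtime is (1 - reduction) of the old one, so speedup = old / new.
    return 1.0 / (1.0 - reduction)


def runtime_reduction_from_speedup(speedup: float) -> float:
    """Convert a speedup factor (1.7 = 1.7x faster) into a fractional runtime reduction."""
    if speedup < 1.0:
        raise ValueError("speedup must be >= 1")
    return 1.0 - 1.0 / speedup


print(round(speedup_from_runtime_reduction(0.40), 2))  # 1.67
print(round(runtime_reduction_from_speedup(1.7), 3))   # 0.412
```

So a 40% shorter run is roughly a 1.7x speedup, not a 40% increase in anything.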
June 6, 2025 at 10:48 PM
One-hour run of 7-plex Genome in a Bottle genomes. Duplex efficiency of about 70-80%. Variant calling with GATK and Roche ML. DeepVariant slightly better.
February 20, 2025 at 5:28 PM
1/
What reference genome should you use?
Sounds easy. It’s not.
GRCh37? GRCh38? hs37d5?
Have you heard of T2T or the new pan-genome-aware DeepVariant?
This matters more than you think.
www.biorxiv.org/content/10....
October 26, 2025 at 1:45 PM
The Earth BioGenome Project, launched in 2018, aims to sequence 1.85 million eukaryotic species by 2028 at an estimated cost of US$5 billion. It has contributed genomes for 4,386 species so far and benefits from AI tools like DeepVariant that improve accuracy.
How AI is sequencing the genomes of all known living species on Earth
An ambitious plan to sequence genomes for 1.85 million eukaryotic species on our planet is underway. It's a massive undertaking that will dramatically enhance our understanding of biology, and inform ...
newatlas.com
November 12, 2025 at 3:47 PM
Has anybody used the Google DeepVariant caller? How do you like it? https://github.com/google/deepvariant
GitHub - google/deepvariant: DeepVariant is an analysis p...
DeepVariant is an analysis pipeline that uses a deep neur...
github.com
November 16, 2024 at 8:27 PM
Yeah, CNN-based algorithms are transforming pathology for the better, DeepVariant is quite good, AlphaFold is revolutionary, etc.

But those are specialized models, not these LLMs that are taking over everywhere else
October 17, 2025 at 2:29 PM
Last time I did genetics … GATK was new and the best thing ever. I don’t feel that’s the case anymore, especially with non-model organisms. It’s lacking in parallelism, and I don’t have gold standard sets. So, the new rage is #DeepVariant … but my new favorite is bwa-meme, and containers
November 23, 2024 at 1:24 AM
Presenting interesting benchmark datasets from Google DeepVariant case studies (haven't seen these before, they look useful): github.com/google/deepv...
GitHub - google/deepvariant: DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. - google/deepvariant
github.com
July 16, 2024 at 2:38 PM
Yes, but it turned out to be a really silly mistake. DeepVariant kept saying my reference genome didn't match the alignments. Turns out I pulled the linear reference from the "raw" GFA instead of the GBZ that I used with giraffe 🤦
December 25, 2024 at 3:12 PM
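A quick way to catch this kind of mismatch before running a caller is to compare the contigs in the reference index with the `@SQ` lines in the alignment header. A minimal sketch, assuming the mismatch shows up as differing contig names or lengths; it works on text you already have (a `.fai` index and a SAM header), the function names and toy data are made up for illustration, and this is not DeepVariant's own check:

```python
def contigs_from_fai(fai_text: str) -> dict[str, int]:
    """Parse a FASTA index (.fai): tab-separated, contig name then length first."""
    contigs = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        name, length = line.split("\t")[:2]
        contigs[name] = int(length)
    return contigs


def contigs_from_sam_header(header_text: str) -> dict[str, int]:
    """Parse @SQ lines of a SAM/BAM header; SN is the name, LN the length."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def contig_mismatches(ref: dict[str, int], aln: dict[str, int]) -> list[str]:
    """List contigs the alignments use that are missing or differently sized in the reference."""
    problems = []
    for name, length in aln.items():
        if name not in ref:
            problems.append(f"{name}: in alignments but not in reference")
        elif ref[name] != length:
            problems.append(f"{name}: length {length} in alignments vs {ref[name]} in reference")
    return problems


# Toy data, purely illustrative.
fai = "chr1\t1000\t6\t60\t61\nchr2\t500\t1030\t60\t61\n"
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chr2\tLN:480\n"
    "@SQ\tSN:chr3\tLN:200\n"
)
for problem in contig_mismatches(contigs_from_fai(fai), contigs_from_sam_header(header)):
    print(problem)
```

An empty result only says the names and lengths agree; it does not prove the sequences are identical.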
The models from DeepVariant (and friends) have shown that data doesn't have to be Illumina reads for a caller to squeeze all the signal out of it. Another example here for Roche SBX Duplex data, the technology that will become commercially available in 2026.
I'll be speaking in this webinar (go.roche.com/sbx-d) on September 10, where I'll share our benchmarks and observations for Roche's SBX sequencing instrument, as well as models developed by our team for SBX data.
Germline Small Variant Calling Workflow for SBX Duplex Data
Wednesday, September 10, 2025 at 12:00 PM Eastern Daylight Time.
go.roche.com
August 21, 2025 at 8:37 AM
Release led by DeepVariant tech lead Kishwar Shafin. Team Engineering manager Pi-Chuan Chang. Small model work led by Lucas Brambrink. Pangenome-aware led by Mobin Asri and Juan Carlos Mier. Fast pipeline by Alexey Kolesnikov. Kinnex/MAS-Seq model by Daniel Cook and Shiyi Yin from Verily. 3/3
December 5, 2024 at 5:57 PM
Joint processing of long- and short-read sequencing data with deep learning improves variant calling. [updated]
Illumina+Nanopore data & DeepVariant enhance accurate, cost-effective germline variant calling, improving diagnostics.
March 21, 2025 at 3:48 PM
Overcoming limitations to customize DeepVariant for domesticated animals with TrioTrain. #GeneticVariants #DeepVariant #NonHumanSpecies #Bioinformatics @genomeresearch.bsky.social 🧬 🖥️
bsky.app/profile/nano...
June 6, 2025 at 8:31 AM
Why does GATK exist when bcftools does a perfectly good job, especially given that they both end up feeding DeepVariant or DeepSomatic? Why isn’t everything shoved into Parquet via GA4GH APIs?

You get the idea
November 7, 2025 at 12:45 AM
Google taught an AI that sorts cat photos to analyze DNA—and it's really good at it, reports @sarahzhang https://www.theatlantic.com/science/archive/2017/12/google-deepvariant-dna/547634/?utm_source=twb
Google Taught an AI That Sorts Cat Photos to Analyze DNA
And it’s very good at it.
www.theatlantic.com
November 26, 2024 at 8:12 PM
Added SPRQ to PacBio training, reducing indel error on SPRQ by 26%. Added Platinum Pedigree training data for PacBio model, reducing errors by 34% on more extensive Platinum truth. New model and case study for Kinnex/MAS-Seq/Iso-Seq. Additional speed options for GPU pipelines 2/3
December 5, 2024 at 5:57 PM
Want to benefit from pangenomes and want a recipe?

github.com/google/deepv...

Shows a step-by-step process, with Docker images, for how to map to a pangenome reference w/ vg and call variants w/ DeepVariant. Final calls are more accurate and in GRCh38 coordinates. Thanks to the UCSC team for co-development
October 26, 2023 at 4:34 PM
Release of DeepVariant v1.6.

Support for haploid regions, chrX/Y.
Workflow for Pangenome FASTQ-to-VCF.
Major DeepTrio improvements for de novo variants.
Models for CompleteGenomics T7, G400
Add NovaSeqX to training data

Release by Kishwar Shafin

github.com/google/deepv...
Release DeepVariant 1.6.0 · google/deepvariant
Improved support for haploid regions, chrX and chrY. Users can specify haploid regions with a flag. Updated case studies show usage and metrics. Added pangenome workflow (FASTQ-to-VCF mapping with V...
github.com
October 26, 2023 at 4:32 PM
Note: OLD POST! (2023), but I just noticed it.

While it's nice to see comparisons, why compare an (at the time) 2 year old GATK against a 5 year old bcftools?

Since then both have come on a lot. It'd be interesting to see new independent comparisons. (Neither can hold up to DeepVariant now.)
Important comparison of Bcftools and GATK in simulated Drosophila genomes: "by benchmark analyses with a simulated insect population...Bcftools mpileup performs better than GATK HaplotypeCaller in terms of recovery rate and accuracy regardless of mapping software."
The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species...
Scientific Reports - The evaluation of Bcftools mpileup and GATK HaplotypeCaller for variant calling in non-human species
www.nature.com
September 18, 2025 at 8:58 PM