Matt Holt
@holtjma.bsky.social
Staff scientist at @PacBio; formerly @hudsonalpha; avid gamer; opinions are my own
Even if you do not (yet) fully buy in to basepair scoring, Aardvark includes a traditional genotype score... and it calculates both sets of scoring metrics *really* fast!

For small variants, it is on average 16x faster than hap.py, with most runs finishing in <2 minutes (16 threads).

(6/N)
October 6, 2025 at 8:14 PM
Since Aardvark looks at sequences, it enables some comparisons that were previously very challenging:

1. Tandem repeat (TR) v. TR benchmarking
2. TR v. small variant benchmarking
3. Structural variant (SV) benchmarking
4. Joint benchmarking (small + SV)

(5/N)
October 6, 2025 at 8:13 PM
The main addition in Aardvark is the "basepair" scoring scheme, which compares local haplotype *sequences* instead of variants and genotypes. See the attached figure for a quick example of how basepair scoring compares to genotype scoring.
(2/N)
October 6, 2025 at 8:09 PM
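A toy sketch of the idea in the post above, comparing haplotype sequences rather than variant records. This is not Aardvark's implementation or scoring formula: the helper `apply_variants`, the reference string, and the simplified 0-based `(pos, ref, alt)` tuples are all hypothetical and chosen only for illustration. It shows one 2 bp deletion in a CA repeat written at two different positions; the records differ, but the resulting sequences are identical.

```python
def apply_variants(ref, variants):
    """Apply non-overlapping (pos, ref_allele, alt_allele) edits to ref.

    Positions are 0-based; this is a simplified stand-in for VCF records.
    """
    out = []
    cursor = 0
    for pos, ref_allele, alt_allele in sorted(variants):
        # Sanity check: the stated REF allele must match the reference.
        assert ref[pos:pos + len(ref_allele)] == ref_allele, "REF allele mismatch"
        out.append(ref[cursor:pos])   # unchanged sequence before the variant
        out.append(alt_allele)        # substituted allele
        cursor = pos + len(ref_allele)
    out.append(ref[cursor:])          # unchanged tail
    return "".join(out)


REF = "GGCACACATT"  # hypothetical reference with a (CA)x3 repeat

# The same 2 bp deletion, placed at different copies of the repeat unit.
truth_calls = [(2, "CA", "")]  # delete the first CA unit
query_calls = [(6, "CA", "")]  # delete the last CA unit

# Naive record-level comparison: the records are not identical.
variant_match = set(truth_calls) == set(query_calls)

# Sequence-level comparison: build each haplotype, then compare.
truth_hap = apply_variants(REF, truth_calls)
query_hap = apply_variants(REF, query_calls)
sequence_match = truth_hap == query_hap

print(f"variant records match: {variant_match}")
print(f"haplotype sequences match: {sequence_match}")
```

Real comparison tools usually normalize variant representations before matching records, so this naive record comparison overstates the gap; the sketch only illustrates why comparing sequences sidesteps representation differences.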
I have never seen a more beautiful image in my life, hype train!
April 2, 2025 at 2:21 PM
Xiao Chen #ACMGMtg25 describing Kivvi tool for assembling long repeat units in medically relevant genes (KIV2 and D4Z4) using #PacBio HiFi reads. Large repeats accurately assembled!
March 20, 2025 at 7:10 PM
The highly elusive North Alabama snow day is here! ⛄️
January 10, 2025 at 1:19 PM
Just as a follow-up, I was able to find the script that did this and test it using the original HiFi VCFs (i.e., high coverage) but the downsampled HiFi data. The attached figure is more in line with what I would expect. Higher risk of switchflip errors at the lower coverages, of course.
January 9, 2025 at 5:53 PM
@3rdreviewer.bsky.social This figure is downsampling with *just HiFi* in our HiPhase supplement. This is not exactly what you want because variant calling was a part of the experiment. Given quality variant calls, I expect the NG50 would be even better. Paper here: academic.oup.com/bioinformati...
January 9, 2025 at 4:07 PM
Forgot to attach an image to 8/10, so here it is! An example of a CYP2D6 duplication event that is directly observable with long-read sequencing. We've enhanced this image to make it more obvious, but orange reads have *direct evidence* of two CYP2D6 *4.004 alleles. 11/10
December 11, 2024 at 3:07 PM
With collaborators at Children’s Mercy Kansas City, Estonian Genome Centre, HudsonAlpha Institute for Biotechnology, and SingHealth Duke-NUS Institute of Precision Medicine, we also explore population haplotype distributions of these genes in 1,452 WGS datasets! 7/N
December 11, 2024 at 2:35 PM
Across our entire benchmark, StarPhase diplotypes exactly match for 96.2%; an additional 3.3% are minor discrepancies caused by outdated comparators or database limitations, and we identify only 16 mismatches (0.5%)! Manual inspection of each supported the StarPhase diplotypes. 5/N
December 11, 2024 at 2:33 PM
It’s a packed room for Xiao Chen’s talk on resolving paralogous genes with #PacBio HiFi sequencing! #ASHG24
November 8, 2024 at 5:48 PM
One Republic putting on quite a show for us!

#ASHG24 #PacBio
November 7, 2024 at 4:45 AM
Snowy blue bear greeting us at #ASHG24 this morning!
November 6, 2024 at 4:01 PM
Interested in long-read pharmacogenomics? Then I have some exciting things to show you at #ASHG24... Looking forward to next week in Denver!

#PacBio #PGx
October 28, 2024 at 8:08 PM
Up bright and early for the Liz Hurley Ribbon Run for cancer awareness!
October 19, 2024 at 2:14 PM
FYI, you can basically take the Path all the way to Nathan Phillips Square from the #ACMGMtg24 conference center. Don’t let the rain keep you from exploring!
March 14, 2024 at 10:13 PM
Happy Friday!
February 23, 2024 at 9:46 PM
We just added a couple tracks to MethBat segmentation that allow the output of haplotype-specific methylation status (methylated, unmethylated, or no data). Example IGV images of what this might look like are attached.

Full release notes: github.com/PacificBiosc...
February 20, 2024 at 4:25 PM
The biggest change relative to our pre-print is that HiPhase can now phase tandem repeat calls #STR from TRGT in addition to the small and structural variants from before. On average, this added ~68K phased variants per sample that were previously ignored!
January 26, 2024 at 3:08 PM
Poster is up! Let’s talk present and future of #PacBio #HiFi #phasing today from 3-5 PM at #ASHG23 poster number 3397!
November 2, 2023 at 1:35 PM
Whoever made Pokémon Halloween card packs is a genius, kids are going nuts over here!
November 1, 2023 at 12:21 PM
Getting ready for the #RibbonRun for Huntsville hospital!
October 21, 2023 at 1:20 PM
Had a great time exploring the tide pools of San Diego and syncing up with @pacbio.bsky.social colleagues!
October 11, 2023 at 1:43 PM
That first auto PR after trying to get bioconda autobump to work for months…
September 21, 2023 at 10:18 PM