Adam Phillippy
@aphillippy.bsky.social
Finished a human genome, working on a few more 👨‍💻
Lab: https://genomeinformatics.github.io
Posts are my own
Was an abstract-selected talk on new work, rather than an invitation. Super bummed not to be there :(
November 7, 2025 at 3:04 PM
Here I am! Looks like you’re missing an L. Are you thinking of Heng’s dipcall? github.com/lh3/dipcall
GitHub - lh3/dipcall: Reference-based variant calling pipeline for a pair of phased haplotype assemblies
github.com
October 20, 2025 at 6:12 PM
Reposted by Adam Phillippy
Read the preprint here with all the details, plus lots of other long-read powered analysis! www.medrxiv.org/content/10.1...
Population-scale Long-read Sequencing in the All of Us Research Program
The All of Us Research Program (AoU) is a national biobank seeking to enroll one million individuals in the United States to link genomic and biomedical data, including short- and long-read whole-geno...
www.medrxiv.org
October 14, 2025 at 5:40 PM
Congrats Krystal and co!
October 15, 2025 at 1:01 PM
Reposted by Adam Phillippy
actually it reminds me more of finding 1000s of human contaminants annotated as proteins within draft bacterial genomes in GenBank, which we published in 2018 (and @aphillippy.bsky.social knows this work): pubmed.ncbi.nlm.nih.gov/31064768/
Human contamination in bacterial genomes has created thousands of spurious proteins - PubMed
Contaminant sequences that appear in published genomes can cause numerous problems for downstream analyses, particularly for evolutionary studies and metagenomics projects. Our large-scale scan of com...
pubmed.ncbi.nlm.nih.gov
October 14, 2025 at 3:05 PM
The pangenome resources and genome assembly/inference approaches we are building will eventually enable complete, personalized “T2T” genomes for everyone. This is the thesis of the Q100 project (www.biorxiv.org/content/10.1...) and what my group is currently working towards. Stay tuned... [10/10]
A complete diploid human genome benchmark for personalized genomics
Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and ...
www.biorxiv.org
October 13, 2025 at 8:17 PM
Each genome is unique and should be treated as such. Analyzing the complete, personalized genome of an individual (yes, with the help of AI) will reduce reference bias and allow for the deep characterization of rare and novel structural variants that are the basis of many genetic diseases [9/10]
October 13, 2025 at 8:16 PM
“I am suggesting we should flip that model, and we should map the metadata to the sequence of the patient, meaning we complete the patient’s genome, and then we take all of that metadata and we annotate it onto the personalized reference.” [8/10]
October 13, 2025 at 8:16 PM
By sampling the pangenome to build good priors on what a typical genome looks like, you can do a much better job of inferring a patient’s genome. “Perhaps, in the future, scientists can depart from the approach of mapping sequencing reads ... and accessing data in the context of the reference” [7/10]
October 13, 2025 at 8:16 PM
And there is A LOT more structural variation in a typical human genome than most people realize, even between the two haplotypes of a single person’s genome. It can have big effects but is rarely captured, e.g. recurrent inversions doi.org/10.1016/j.ce... [6/10]
October 13, 2025 at 8:16 PM
This point is often lost. One enormous benefit of the Human Pangenome Project is that it improves our general understanding of natural human variation. It’s like the 1000 Genomes Project, but inclusive of ALL variation, not just the variants you can see with short-read variant calling [5/10]
October 13, 2025 at 8:15 PM
When [Phillippy] hears scientists say: “Oh, the pangenome is not for me,” he tells them, “You’re using it.” Illumina’s DRAGEN software already calls variants using graph genomes. Approaches related to graph genomes are, he says, “happening behind the scenes.” [4/10]
October 13, 2025 at 8:15 PM
“Conceptually, the pangenome represents all of humankind’s genetic information ... Population projects cannot sample each individual in the world, so the idea is to represent the population’s multitude.” This cannot be done with singular references. Enter the HPRC @humanpangenome.bsky.social [3/10]
October 13, 2025 at 8:15 PM