Daniel Baker
@dnbaker.bsky.social
Computational genomics, with focus on indexing + search, data re-use, and scalable algorithms.

C++/Rust enthusiast, hardware/algorithms.

Johns Hopkins PhD, prior ARUP Laboratories and PacBio.

Currently building a GPU-accelerated platform at Roche.
Reposted by Daniel Baker
“StarPhase: Comprehensive Phase-Aware Pharmacogenomic Diplotyper for Long-Read Sequencing Data” is now on biorxiv! In this work, we explore the use of long-read sequencing (#PacBio #HiFi) for #pharmacogenomics #PGx. 1/N

Pre-print: doi.org/10.1101/2024...
Repo: github.com/PacificBiosc...
GitHub - PacificBiosciences/pb-StarPhase: A phase-aware pharmacogenomic diplotyper for PacBio datasets
December 11, 2024 at 2:30 PM
Reposted by Daniel Baker
Check out a new release of pbfusion (0.5.0) for accurate detection and visualization of fusion transcripts from @pacbio.bsky.social HiFi data.

github.com/PacificBiosc...
December 13, 2024 at 10:15 PM
Reposted by Daniel Baker
1/5 We (Nate Brown, @oahmed.bsky.social, Travis Gagie, and @benlangmead.bsky.social) developed Movi, a cache-efficient full-text pangenome index. It's the fastest full-text index for pangenomes, particularly appropriate for adaptive sampling where the time budget is important.
November 7, 2023 at 6:46 PM
Reposted by Daniel Baker
Good lord, the FoldSeek clustering of all AlphaFold structures is incredible. Just stunning. Vastly expands the taxonomic breadth of microbes that encode homologous structures that we care about. Paper here: www.nature.com/articles/s41... and website to search here: cluster.foldseek.com
September 13, 2023 at 7:43 PM