Paul Blischak
pblischak.bsky.social
Genomics, Data Science, & Computational Genetics at Bayer Crop Science :: pblischak.github.io :: PhD + BSc at Ohio State :: Postdoc at U. Arizona :: (he/him)
If you want to try something that doesn't include the phasing stuff (I'm not sure if it does multiallelic genotyping though), there's updog:

github.com/dcgerard/updog
GitHub - dcgerard/updog: Flexible Genotyping of Polyploids using Next Generation Sequencing Data
June 12, 2025 at 5:09 PM
3. Unfortunately, I think ~50x coverage may not be enough to distinguish between all of the different possible heterozygous states in a hexaploid, so the model could also be having trouble finding the best genotype estimates
June 12, 2025 at 5:08 PM
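To get a rough feel for the coverage point in (3), here is a minimal sketch under strong simplifying assumptions that are not from the thread: a biallelic site, no sequencing error or allelic bias, reads drawn binomially with alt-allele fraction dosage/ploidy, and only two candidate dosages that are equally likely a priori. The function names are hypothetical. It computes the error rate of the best possible (maximum-likelihood) call between two dosages at a given depth.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of seeing k alt reads out of n when the alt fraction is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def misclassification_rate(ploidy: int, depth: int, dosage_a: int, dosage_b: int) -> float:
    """Error rate of a maximum-likelihood call between two dosages.

    Assumes (for illustration only) equal prior weight on both dosages,
    no sequencing error, and no allelic bias.
    """
    p_a = dosage_a / ploidy
    p_b = dosage_b / ploidy
    error = 0.0
    for k in range(depth + 1):
        like_a = binom_pmf(k, depth, p_a)
        like_b = binom_pmf(k, depth, p_b)
        # Whichever dosage is less likely at this read count is the one
        # that gets miscalled; each dosage has prior weight 0.5.
        error += 0.5 * min(like_a, like_b)
    return error


if __name__ == "__main__":
    # Adjacent heterozygous dosages in a hexaploid (2 vs 3 alt copies).
    for depth in (20, 50, 100, 200):
        rate = misclassification_rate(6, depth, 2, 3)
        print(f"depth={depth:>3}  error rate 2 vs 3 copies: {rate:.4f}")
```

Real data adds sequencing error, overdispersion, and multiallelic states, all of which make adjacent dosages harder to separate than this idealized calculation suggests.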
2. In addition to the combinatorial issues, there's an implicit assumption of autopolyploidy (randomly sampled alleles), so things like fixed heterozygosity can really throw off the probability calculations
June 12, 2025 at 5:03 PM
If I had to guess, I think these could be the main causes:

1. Freebayes models multiallelic genotypes, and uses locally phased windows to share information. For a hexaploid, considering all of the combinations of multiallelic genotypes at multiple sites can get really massive
June 12, 2025 at 5:01 PM
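The combinatorial growth in (1) can be sketched with a simple count. This is only an illustration of how many unphased genotypes exist, not a description of how Freebayes enumerates or prunes them internally; the function names are hypothetical. It assumes a genotype is an unordered multiset of alleles, so a site with k alleles at ploidy p has C(p + k - 1, p) possible genotypes, and that sites in a window are considered jointly.

```python
from math import comb


def n_genotypes(ploidy: int, n_alleles: int) -> int:
    """Number of unphased genotypes: multisets of size `ploidy` over `n_alleles`."""
    return comb(ploidy + n_alleles - 1, ploidy)


def n_joint_genotypes(ploidy: int, alleles_per_site: list[int]) -> int:
    """Number of genotype combinations when several sites are considered together."""
    total = 1
    for k in alleles_per_site:
        total *= n_genotypes(ploidy, k)
    return total


if __name__ == "__main__":
    for ploidy in (2, 4, 6):
        counts = [n_genotypes(ploidy, k) for k in (2, 3, 4)]
        print(f"ploidy={ploidy}: genotypes for 2/3/4 alleles = {counts}")
    # Three sites with four alleles each in a hexaploid.
    print(n_joint_genotypes(6, [4, 4, 4]))
```

For a diploid biallelic site the count is 3, while a hexaploid site with four alleles already has 84 genotypes, and three such sites considered together give 84^3 = 592,704 combinations.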
Rust and serde really are amazing tools for data. I’m curious if you’ve tried pydantic – it looks interesting but I haven’t spent the time to learn it well yet. It’s not the level of static validation that the compiler gives but maybe with mypy and type hints it could make things better in Python?
April 9, 2025 at 4:00 PM