Ivar Grytten
@ivargrytten.bsky.social
Bioinformatics, Python
Some shameless BioNumPy (github.com/bionumpy/bio...) advertisement at the end: this project would not have been possible without it. I love how simple it now is to read VCFs, process genotype matrices, read FASTQ files, compute k-mers, etc., which has enabled fast prototyping and experimentation.
GitHub - bionumpy/bionumpy: Python library for array programming on biological datasets. Documentation available at: https://bionumpy.github.io/bionumpy/
December 25, 2023 at 1:48 PM
We can maybe forget about high read coverage: there is almost no accuracy gain in going from 5x to 30x coverage. This might be because imputation is such a big part of the prediction model that 5x is more than enough to guide the model in the right direction.
December 25, 2023 at 1:48 PM
Pangenome size matters -> we should, as a community, invest in making larger pangenomes. This is maybe somewhat obvious, but it is nice to get it confirmed. The x-axis is the number of individuals in the pangenome.
December 25, 2023 at 1:47 PM
SNPs/indels are important when genotyping SVs. Our experiments show that SV genotyping accuracy increases drastically when we add more SNPs/indels to the pangenome. The x-axis in the plot below is allele frequency: SNPs/indels with a frequency lower than the x-axis value are filtered out.
December 25, 2023 at 1:46 PM
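To make the allele-frequency filtering in that experiment concrete, here is a minimal sketch in plain Python. It is not the KAGE2 pipeline: the function names and the toy haplotype matrix are made up for illustration, and it assumes biallelic variants stored as rows of 0/1 haplotype alleles.

```python
def allele_frequencies(haplotypes):
    """Alt-allele frequency per variant.

    Rows are variants, columns are haplotypes (0 = ref, 1 = alt).
    """
    return [sum(row) / len(row) for row in haplotypes]


def filter_by_allele_frequency(haplotypes, min_freq):
    """Return indices of variants whose alt-allele frequency is >= min_freq.

    Variants with a frequency lower than min_freq are filtered out.
    """
    freqs = allele_frequencies(haplotypes)
    return [i for i, freq in enumerate(freqs) if freq >= min_freq]


# Hypothetical toy data: 4 variants observed on 4 haplotypes.
haplotypes = [
    [1, 0, 0, 0],  # frequency 0.25
    [1, 1, 0, 1],  # frequency 0.75
    [0, 0, 0, 0],  # frequency 0.0
    [1, 1, 0, 0],  # frequency 0.5
]

kept = filter_by_allele_frequency(haplotypes, min_freq=0.5)
print(kept)
```

Raising `min_freq` keeps fewer SNPs/indels in the pangenome, which is the direction in which SV genotyping accuracy dropped in the experiment.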
We were surprised by how good GLIMPSE is at imputing SVs! We ended up simply relying on GLIMPSE in KAGE2 rather than using our own imputation model. I really appreciate those rare moments when existing bioinformatics tools actually work seamlessly together to produce good results.
December 25, 2023 at 1:45 PM
Genotyping SVs from reads alone yields much lower accuracy than genotyping combined with imputation. Even KAGE/PanGenie with very few reads (0.5x coverage) perform much better than e.g. BayesTyper (30x coverage), which does not do imputation.
December 25, 2023 at 1:45 PM