Alex Leonard
icyhawaiian.bsky.social
Trying to turn wood into marble in bovine pangenomics. If I'm not pushing commits, I'm hopefully off diving.
Oooh quantum century!
November 27, 2025 at 5:45 AM
Yeah, we definitely have some cases of these foldback reads. This is useful because these reads need some QC'ing, but technically ~100% of the sequence aligns, so they look as good as any other SV-containing read with sup alignments.
October 8, 2025 at 10:02 AM
Highly repetitive is easier to eyeball. We have some 700 Kb reads where 400 bp (yes, bp) align to the reference and the rest is soft-clipped "TGTGTG...". Some unaligned reads are legitimately just telomeres (the cattle reference is incomplete), so it's hard to find a single metric to separate real from fake.
October 8, 2025 at 10:00 AM
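A rough sketch of the two numbers one might eyeball for reads like the one above: the fraction of read bases that actually align (from a single alignment's CIGAR string) and how much of a clipped sequence is a simple "TG" repeat. The function names and the choice to count insertions as aligned are my assumptions, and, as noted, no single cutoff on either value will cleanly separate artefacts from real telomeric sequence.

```python
import re

# Matches one CIGAR operation, e.g. "400M" or "699600S".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_fraction(cigar: str) -> float:
    """Fraction of read bases inside the alignment for ONE alignment record.

    Assumption: M/=/X/I count as aligned read bases, S/H as clipped.
    D/N/P consume no read bases and are ignored.
    """
    aligned = clipped = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=XI":
            aligned += n
        elif op in "SH":
            clipped += n
    total = aligned + clipped
    return aligned / total if total else 0.0


def repeat_fraction(seq: str, unit: str = "TG") -> float:
    """Fraction of seq covered by non-overlapping copies of a repeat unit."""
    if not seq:
        return 0.0
    return seq.count(unit) * len(unit) / len(seq)


if __name__ == "__main__":
    # Hypothetical read: 400 bp aligned out of 700 Kb, rest soft-clipped.
    print(aligned_fraction("400M699600S"))
    print(repeat_fraction("TGTGTGTGTG"))
```

This only looks at one alignment at a time; a read with supplementary alignments would need the aligned intervals merged across records before computing the fraction.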
Sadly have to decide between maximising the throughput or hitting the Euler hard quota with uncompressed genomes. Painful to see how much time it adds though
October 3, 2025 at 1:43 PM
Compared to `mash triangle` for 208 bgzipp'd cattle genomes on 4 threads, it only took 65 min vs 230 min. Used slightly more RAM (8.2 GB vs 3.3 GB), but easily worth the speedup! Mash defaults are also k=21 and 1k sketches, so not a fair comparison to k=31 and 10k (not sure about sketches v buckets)
October 2, 2025 at 4:29 PM
No longer need to guess what the dorado bam tags mean!
July 18, 2025 at 10:01 AM
And now his watch is ended.

Hopefully there will be a point release or something to get IGV loading CRAM 3.1.
May 10, 2025 at 10:06 AM
Hopefully htsjdk will have merged support for CRAM 3.1 around the same time. Recompressing 3.1 to 3.0/BAM for collaborators so they can view the alignments in IGV is a bit of a pain, and 3.1 being the default will probably lead to that issue more often.
March 27, 2025 at 10:54 AM
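For reference, the recompression step mentioned above can be sketched as a small wrapper that builds a `samtools view` command writing CRAM 3.0. The file names, the helper name, and the thread count are placeholders I made up; the `version=3.0` format option and the `-T`, `-O`, `-o`, and `-@` flags are standard samtools/htslib options, but check them against your installed version.

```python
import subprocess


def cram_downgrade_cmd(src: str, dst: str, reference: str, threads: int = 4) -> list[str]:
    """Build a samtools command that rewrites a CRAM as CRAM 3.0.

    Hypothetical helper: paths and thread count are placeholders.
    """
    return [
        "samtools", "view",
        "-@", str(threads),          # extra compression/decompression threads
        "-T", reference,             # reference FASTA used for CRAM encoding
        "-O", "cram,version=3.0",    # force the older CRAM version on output
        "-o", dst,
        src,
    ]


if __name__ == "__main__":
    cmd = cram_downgrade_cmd("sample.v31.cram", "sample.v30.cram", "reference.fa")
    print(" ".join(cmd))
    # Uncomment to actually run it (requires samtools on PATH):
    # subprocess.run(cmd, check=True)
```

Swapping `"cram,version=3.0"` for `"bam"` would give the BAM route instead, at the cost of larger files.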