Amanda Clare
@amandaclare.bsky.social
Senior lecturer. Metagenomics, bioinformatics, data science, databases, error analyses and writing code. Aberystwyth, UK. she/her.
Orcid: 0000-0001-8315-3659
Anyway, very useful as a distance between two probability distributions. en.wikipedia.org/wiki/Wassers...
Wasserstein metric - Wikipedia
December 2, 2025 at 6:30 PM
... "Wasserstein metric" in the literature, so I had felt awkward about using the other name. He asked if I'd found out from Wikipedia that it was actually Kantorovich. I said yes, about a year ago. He said it was he and his student who had made the naming updates to that Wikipedia page, about two years ago. 2/2
December 2, 2025 at 6:18 PM
And welcome @martinjvickers.bsky.social to Bluesky!
December 1, 2025 at 7:27 PM
It was really fun and hopeful, describing the potential for a better future with @nickdimonaco.bsky.social (QUB) and Martin Vickers (JIC). Although bioRxiv doesn't accept review paper preprints, here it is on Figshare: figshare.com/articles/pre... 6/6
Genome Assemblies and Annotations Are Not Static and Need Support for Tracking Their Evolution
For the past 25 years, genomic data has been distributed in two key file formats, FASTA and GFF. These files are used across nearly all genomic analyses and encode both the data of genomic sequences ...
December 1, 2025 at 7:07 PM
We examine the limitations of genome file formats, demonstrate why incremental improvements are insufficient, and argue that genomics must adopt version control with the same gusto that is applied to generating new sequencing data. 5/6
December 1, 2025 at 7:07 PM
The informality paradox: FASTA and GFF are simple, flexible and incredibly successful as file formats. But the features that made them successful make them incompatible with the systematic change tracking that modern collaborative genomics desperately needs. 4/6
December 1, 2025 at 7:07 PM
It doesn't exist yet, but we've summarised where we are now and set out our thoughts on what would be great to have. Could we have propagation of sequence updates to annotations? Support for conflict resolution? Biologically meaningful genome diffs? Alternatives in branches? 3/6
December 1, 2025 at 7:07 PM
You've downloaded a FASTA and a GFF and found that the identifiers don't match? You know there's a new assembly online but it's going to mess up your carefully curated annotations? You're working in an international pangenome consortium and have assemblies and annotations galore? 2/6
December 1, 2025 at 7:07 PM
The nibbles in there today, though, were not so healthy.
November 28, 2025 at 5:19 PM
Reposted by Amanda Clare
We’re working on software that tracks changes to genomic DNA sequences, both intentional and the actual results from real-life sequencing. We wrote up a blog post describing how that could work for a bacterial genome: www.genhub.bio/blog/2025/10... (2/5)
November 5, 2025 at 11:16 PM