Jakub Vašíček
jvasicek.bsky.social
PhD student in bioinformatics @ University of Bergen
And after you've done your search, you can feed your results to this little pipeline, and it will tell you who these peptides are: github.com/ProGenNo/Pro...

Here's a flowchart overview :))
December 11, 2024 at 2:08 AM
Here's an overview of how many new peptides you gain in each of the databases. If you're looking for a database best suited to Europeans, then I'd go for the 1000 Genomes European superpopulation here: zenodo.org/records/12671237 (file 240530_ProHap_EUR.tar.gz)
December 11, 2024 at 1:50 AM
Thanks so much for the shout-out! It's great to see you're finding this useful!

A little note on which of the fastas to use - the HRC data we got access to was aligned with an older genome build, and many variants were lost during liftover. The 50k+ variants are in the 1000 Genomes databases.
December 11, 2024 at 1:47 AM
Please see the GitHub project page for all relevant info on usage, availability of the databases we have created, etc.: github.com/ProGenNo/Pro...
GitHub - ProGenNo/ProHap: Proteogenomics database-generation tool for protein haplotypes and variants
December 10, 2024 at 3:53 AM
Thanks to @lukaskall.bsky.social, @njolstad.bsky.social and the other co-authors who aren't here (yet) for their contributions!
December 10, 2024 at 3:43 AM