David Martínez Millán
dmartmillan.bsky.social
I am an engineer in a biological world.
Thanks, Brendan! More than welcome to try it out!
December 12, 2024 at 2:59 PM
Thank you so much!!! ☺️
December 12, 2024 at 10:24 AM
I would like to thank everyone involved in the development of the tool: Federica Brando, Miguel L. Grau, @guixe-m.bsky.social, Carlos López-Elorduy, Iker Reyes-Salazar, Jordi Deu-Pons, @nlbigas.bsky.social and Abel González-Pérez.
December 12, 2024 at 10:11 AM
Overall, OpenVariant addresses a significant problem in the field by aggregating cohort-level data from multiple sources into a single harmonized result set. It replaces many of the tedious steps involved in curating data with a more robust and easier-to-document process.
December 12, 2024 at 10:11 AM
OpenVariant is open-source software released under the BSD 3-Clause license and freely available for public use. It is designed to be easily extended, to encourage collaborative development, and is available on GitHub: github.com/bbglab/openv...
GitHub - bbglab/openvariant: Read, parse and operate different multiple input file formats with OpenVariant
December 12, 2024 at 10:11 AM
We integrated OpenVariant as the first step in the IntOGen pipeline (www.intogen.org), where it processed 257,898,749 somatic mutations across 33,218 tumor samples from 271 cohorts, sequenced by different sources and stored in different data formats.
December 12, 2024 at 10:11 AM
No existing tool matches OpenVariant's functionality. We evaluated its execution time against similar Python-based tools using @brent-p.bsky.social's benchmark (github.com/brentp/vcf-b...), which placed OpenVariant among the best of its peers.
GitHub - brentp/vcf-bench: evaluating vcf parsing libraries
December 12, 2024 at 10:11 AM
OpenVariant is built around an annotation structure, a core component that describes how input files are parsed and how the output is represented. It also includes a plugin system that lets users customize how the data is transformed.
December 12, 2024 at 10:11 AM
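To make the annotation-plus-plugin idea concrete, here is a minimal sketch in plain Python. It does not use OpenVariant's API or its annotation file format: the `ANNOTATIONS` dictionary, the `harmonize` function, and the `strip_chr_prefix` plugin are hypothetical names chosen for illustration, and the two tiny inputs are made-up examples. It only shows the general pattern of describing each input format declaratively and mapping it onto one shared output schema.

```python
import csv
import io

# Hypothetical annotations (NOT OpenVariant's real format): each one maps
# the shared output fields to the column names used by one input format.
ANNOTATIONS = {
    "maf_like": {
        "delimiter": "\t",
        "columns": {
            "CHROMOSOME": "Chromosome",
            "POSITION": "Start_Position",
            "REF": "Reference_Allele",
            "ALT": "Tumor_Seq_Allele2",
        },
    },
    "vcf_like": {
        "delimiter": "\t",
        "columns": {
            "CHROMOSOME": "#CHROM",
            "POSITION": "POS",
            "REF": "REF",
            "ALT": "ALT",
        },
    },
}


def strip_chr_prefix(row):
    """Hypothetical plugin: normalize 'chr1' to '1'."""
    chrom = row["CHROMOSOME"]
    if chrom.startswith("chr"):
        row["CHROMOSOME"] = chrom[3:]
    return row


def harmonize(text, annotation, plugins=()):
    """Yield rows in the shared schema, applying each plugin in order."""
    reader = csv.DictReader(io.StringIO(text), delimiter=annotation["delimiter"])
    for record in reader:
        row = {out: record[src] for out, src in annotation["columns"].items()}
        for plugin in plugins:
            row = plugin(row)
        yield row


# Two made-up inputs in different layouts.
maf_text = (
    "Chromosome\tStart_Position\tReference_Allele\tTumor_Seq_Allele2\n"
    "chr1\t12345\tA\tT\n"
)
vcf_text = "#CHROM\tPOS\tREF\tALT\n2\t67890\tG\tC\n"

rows = list(harmonize(maf_text, ANNOTATIONS["maf_like"], plugins=[strip_chr_prefix]))
rows += list(harmonize(vcf_text, ANNOTATIONS["vcf_like"], plugins=[strip_chr_prefix]))
```

Both inputs end up as dictionaries with the same four keys, so downstream code never needs to know which format a row came from. For the tool's actual annotation syntax and plugin interface, see the documentation at openvariant.readthedocs.io.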
We present OpenVariant, a Python package that offers a wide range of functionality for working with multiple variant file formats at once and for managing the annotation of metadata in mutational datasets. You can consult the documentation at: openvariant.readthedocs.io
OpenVariant documentation — OpenVariant
December 12, 2024 at 10:11 AM
Despite efforts to homogenize the data produced by variant callers and the available processing tools, differences in the variants persist across projects. This variability hinders the integration of somatic mutations from different sources, which is key for large cancer genomics analyses.
December 12, 2024 at 10:11 AM