Nezar Abdennur
nvictus.bsky.social
computational biologist / biological computer / asst prof @UMassChan / phd @MIT / http://abdenlab.org
We anticipate that joint dimensionality reduction and projection will become a foundational norm for comparative and integrative analysis of long-range interaction profiles in Hi-C/3C+ data. For example, existing methods for working with classic A/B vectors can be extended to joint higher-order embeddings.
August 11, 2025 at 8:41 PM
We used jointly-hic to create an atlas of 89 human Hi-C samples, uncovering distinct patterns of nuclear architecture associated with heterochromatin composition and demonstrating how higher-order principal components capture missing information about gene expression and regulatory element activity.
August 11, 2025 at 8:41 PM
jointly-hic accomplishes this using mini-batch incremental PCA, allowing for joint decomposition of arbitrarily many contact matrices at any resolution with constant memory.
August 11, 2025 at 8:41 PM
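The idea of decomposing many matrices in mini-batches with bounded memory can be sketched as follows. This is not the jointly-hic implementation: it swaps in a simpler streaming-covariance PCA written with NumPy, and the data layout (rows are genomic bins, columns are features), the function names, and the toy random matrices are all assumptions made for illustration.

```python
import numpy as np


def streaming_pca(batches, n_components):
    """PCA over an iterable of (rows x features) mini-batches.

    Only running sums are kept, so memory depends on the number of
    features, not on how many rows or batches are streamed.
    """
    n_rows = 0
    total = None  # running sum of rows, shape (d,)
    gram = None   # running sum of X^T X, shape (d, d)
    for batch in batches:
        batch = np.asarray(batch, dtype=float)
        if total is None:
            d = batch.shape[1]
            total = np.zeros(d)
            gram = np.zeros((d, d))
        n_rows += batch.shape[0]
        total += batch.sum(axis=0)
        gram += batch.T @ batch
    mean = total / n_rows
    cov = gram / n_rows - np.outer(mean, mean)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order].T  # components as rows


def iter_batches(matrices, batch_size=50):
    """Yield row blocks from each matrix in turn (hypothetical helper)."""
    for mat in matrices:
        for start in range(0, mat.shape[0], batch_size):
            yield mat[start:start + batch_size]


rng = np.random.default_rng(0)
n_bins, n_features = 200, 20
# Toy stand-ins for per-sample matrices; not real Hi-C data.
samples = [rng.normal(size=(n_bins, n_features)) for _ in range(3)]

mean, components = streaming_pca(iter_batches(samples), n_components=4)

# Every sample is projected onto the same axes, so the embeddings
# share one coordinate system.
embeddings = [(mat - mean) @ components.T for mat in samples]
```

Because the fit only ever sees one batch at a time, adding more samples lengthens the stream without increasing peak memory, which is the property the post describes.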
Joint decomposition allows for robust and directly comparable low dimensional representations of arbitrarily many contact maps, providing insights into genome organization across diverse biological contexts, from different tissues to developmental stages.
August 11, 2025 at 8:41 PM
The classic A/B compartment track comes from matrix factorization of a contact matrix into eigenvectors or PCs. When each map is factorized separately, each is projected onto a different coordinate system. Comparing such vectors directly is problematic, especially when seeking information from **higher-order** components.
August 11, 2025 at 8:41 PM
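One concrete symptom of the coordinate-system problem described above is that an eigenvector is only defined up to sign. The sketch below uses two toy symmetric matrices (hypothetical, not real contact maps) that share the same underlying pattern and factorizes each one separately.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins = 50

# Toy compartment-like pattern: alternating blocks of +1 and -1 (hypothetical).
signal = np.where(np.arange(n_bins) % 10 < 5, 1.0, -1.0)


def toy_map(noise_scale):
    """Symmetric rank-one pattern plus symmetric noise."""
    noise = rng.normal(scale=noise_scale, size=(n_bins, n_bins))
    return np.outer(signal, signal) + (noise + noise.T) / 2


def leading_eigenpair(matrix):
    """Largest eigenvalue and its eigenvector for a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(matrix)  # ascending order
    return eigvals[-1], eigvecs[:, -1]


map_a, map_b = toy_map(0.1), toy_map(0.1)
val_a, vec_a = leading_eigenpair(map_a)
val_b, vec_b = leading_eigenpair(map_b)

# If v is an eigenvector then so is -v, so the sign returned for each
# map is arbitrary and the two tracks may come out anti-correlated.
flipped = -vec_a
sign_is_arbitrary = np.allclose(map_a @ flipped, val_a * flipped)

# A post hoc fix for a single component: flip one vector to match the other.
if vec_a @ vec_b < 0:
    vec_b = -vec_b
aligned_corr = np.corrcoef(vec_a, vec_b)[0, 1]
```

Sign flipping handles only the simplest case. For higher-order components, separately fitted bases can also differ in order or be mixed with one another, which is the motivation for fitting a single shared basis and projecting every map onto it.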
We introduce a framework and Python toolkit (github.com/abdenlab/joi...) for analyzing compartmentalization and long-range interactions in chromosome conformation capture data.
GitHub - abdenlab/jointly-hic: Genomics research toolkit for jointly embedding Hi-C 3D chromatin contact matrices into the same vector space
github.com
August 11, 2025 at 8:41 PM
Yes, and more recently Zarr too academic.oup.com/gigascience/...

Beyond making legacy data more accessible, oxbow is also a good conduit to more general-purpose persistent storage.
Analysis-ready VCF at Biobank scale using Zarr
Abstract. Background: Variant Call Format (VCF) is the standard file format for interchanging genetic variation data and associated quality control metrics.
academic.oup.com
July 9, 2025 at 7:02 AM
Reposted by Nezar Abdennur
(4) bpnet-lite: Load official Chrom/BPNet models into PyTorch for downstream tangermeme integration. Improved command-line tools + docs. Still concerns about perf of models trained from scratch -- will be resolved next version!

github.com/jmschrei/bpn...

bsky.app/profile/jmsc...
June 30, 2025 at 6:38 PM
We’re excited and eager for feedback, so please give oxbow a try!

`pip install oxbow`
July 7, 2025 at 9:22 PM
I’m also excited to be presenting Oxbow as part of my talk on composability at the #SciPy2025 Conference on Wednesday! Hope to see some of you there.

cfp.scipy.org/scipy2025/ta...
Breaking the silo: composable bioinformatics through cross-disciplinary open standards SciPy 2025
The practice of data science in genomics and computational biology is fraught with friction. This is in large part because bioinformatic tools tend to be tightly coupled to file input/output. As a res...
cfp.scipy.org
July 7, 2025 at 9:22 PM
It also supports:

* Column projection and pushdown (parsing only the fields you need)
* Complex and nested field types (e.g. alignment tags, variant genotype call data, etc.)
* Genomic range-based queries via an index
* User-defined transports and file systems
July 7, 2025 at 9:22 PM
This update (v0.4.x) provides complete #ApacheArrow data models for 11 file formats and counting, including the GA4GH/htslib formats and UCSC’s BigWig/BigBed.
July 7, 2025 at 9:22 PM
We revamped the #rustlang backend and implemented a new "DataSource" API in #Python, which allows for streaming conventional #genomic files – in-memory, on-disk, or in the cloud – into the modern data tools you use regularly, including #Pandas, #Polars, #DuckDB, and #Dask.
July 7, 2025 at 9:22 PM