Nic Fishman
@njw.fish
Using computers to read the entrails of modernity (statistics, optimization, machine learning), focused on applications in the social and biological sciences.

Currently: Stats PhD @Harvard
Previously: CS/Soc @Stanford, Stat/ML @Oxford

https://njw.fish
8/ 🧬 Synthetic promoter design
Sequences binned by expression → GDEs provide a powerful embedding space for regulatory sequence design.

🦠 SARS-CoV-2 spike protein dynamics
GDEs embed monthly lineage distributions and recover smooth latent chronologies.
May 26, 2025 at 3:52 PM
7/ 🧫 Morphological profiling (20M images)
GDEs model phenotype distributions induced by perturbations and generalize to unseen conditions.

🧬 DNA methylation (253M reads)
We learn tissue-specific methylation patterns directly from raw bisulfite reads — no alignment necessary.
May 26, 2025 at 3:51 PM
6/ 🧬 scRNA-seq lineage tracing
Each clone is a population of cells → a distribution over expression. GDEs predict clonal fate better than prior approaches.

🧪 CRISPR perturbation effects
GDEs can improve zero-shot prediction of transcriptional response distributions.
May 26, 2025 at 3:51 PM
5/ GDEs are built to scale: at inference, you can embed or generate from large samples without retraining.

We show that GDE embeddings are asymptotically normal, grounding this with results from empirical process theory.
May 26, 2025 at 3:51 PM
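One illustrative reading of the asymptotic-normality claim in post 5/ (my paraphrase in standard empirical-process language; the paper's exact conditions and statement may differ): if $\hat{\mu}_n$ is the empirical distribution of $n$ i.i.d. draws from $\mu$ and the encoder $E$ is Hadamard-differentiable over a Donsker class, the functional delta method gives

$$\sqrt{n}\,\bigl(E(\hat{\mu}_n) - E(\mu)\bigr) \;\rightsquigarrow\; \mathcal{N}\bigl(0,\ \Sigma_{\mu}\bigr),$$

so the embedding of a finite sample concentrates around the embedding of the underlying distribution at the usual $\sqrt{n}$ rate.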
4/ With this simple setup, we find surprisingly elegant geometry:

GDE latent distances track Wasserstein-2 (W₂) distances across modalities (shown for multinomial distributions)

Latent interpolations recover optimal transport paths (shown for Gaussians and Gaussian mixtures)
May 26, 2025 at 3:50 PM
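For context on the geometry claims in post 4/, here is a small sketch of the closed-form reference quantities for 1D Gaussians. This is not the paper's code; it only computes the ground-truth W₂ distance and the optimal-transport (displacement) interpolation that the GDE latent distances and interpolations are compared against.

```python
# Closed-form W2 distance and OT interpolation between 1D Gaussians:
# the reference quantities that latent distances/interpolations would track.
import numpy as np

def w2_gaussian_1d(m1, s1, m2, s2):
    """W2 distance between N(m1, s1^2) and N(m2, s2^2)."""
    return np.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)

def ot_interp_gaussian_1d(m1, s1, m2, s2, t):
    """Displacement interpolation at time t in [0, 1]; also a Gaussian."""
    return (1 - t) * m1 + t * m2, (1 - t) * s1 + t * s2

# Example: distance between two Gaussians and the midpoint of their OT path.
print(w2_gaussian_1d(0.0, 1.0, 3.0, 2.0))               # ~3.162
print(ot_interp_gaussian_1d(0.0, 1.0, 3.0, 2.0, 0.5))   # (1.5, 1.5)
```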
🚨 New preprint 🚨

We introduce Generative Distribution Embeddings (GDEs) — a framework for learning representations of distributions, not just datapoints.

GDEs enable multiscale modeling and come with elegant statistical theory and some miraculous geometric results!

🧵
May 26, 2025 at 3:49 PM
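A minimal sketch of the "embed a distribution via its sample" idea from the post above. This is not the authors' implementation: it assumes a DeepSets-style permutation-invariant encoder and a toy conditional Gaussian decoder; the paper's actual architecture and training objective may differ.

```python
# Sketch only: a set encoder maps a sample of points to one embedding z,
# and a conditional generator samples new points given z.
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    def __init__(self, x_dim, z_dim, h=128):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU(), nn.Linear(h, h))
        self.rho = nn.Sequential(nn.Linear(h, h), nn.ReLU(), nn.Linear(h, z_dim))

    def forward(self, x_set):  # x_set: (n_points, x_dim)
        # Mean-pooling makes the embedding permutation-invariant in the sample.
        return self.rho(self.phi(x_set).mean(dim=0))

class ConditionalDecoder(nn.Module):
    """Toy conditional Gaussian generator: draws new points given an embedding z."""
    def __init__(self, x_dim, z_dim, h=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, h), nn.ReLU(), nn.Linear(h, 2 * x_dim))

    def forward(self, z, n_samples):
        mu, log_sigma = self.net(z).chunk(2, dim=-1)
        eps = torch.randn(n_samples, mu.shape[-1])
        return mu + eps * log_sigma.exp()

# Usage: embed one sample (one distribution), then generate new draws from it.
enc, dec = SetEncoder(x_dim=2, z_dim=16), ConditionalDecoder(x_dim=2, z_dim=16)
z = enc(torch.randn(500, 2))     # one distribution -> one embedding
x_new = dec(z, n_samples=100)    # new points conditioned on that embedding
```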
Another key insight: Not all fairness metrics are created equal. Our sensitivity analysis shows how different metrics respond to measurement biases - some are surprisingly fragile, others more robust. 3 / 5
December 12, 2024 at 5:08 PM
Our research reveals a critical challenge: real-world datasets often contain multiple measurement errors, and even small ones can completely change a fairness analysis. We analyzed 14 benchmark datasets to understand these complex interactions. 2 / 5
December 12, 2024 at 5:07 PM
Excited to share our new work on causal sensitivity analysis for fairness metrics at #NeurIPS2024! We've developed a causal sensitivity analysis framework to understand how underlying measurement biases (encoded by DAGs) impact machine learning fairness evaluations. 1 / 5
December 12, 2024 at 5:07 PM
Okay here’s my best shot at the core ideas:
October 6, 2023 at 3:43 PM