sliverdaemon.bsky.social
@sliverdaemon.bsky.social
"Then how should I begin
To spit out all the butt-ends of my days and ways?
And how should I presume?" - T. S. Eliot, "The Love Song of J. Alfred Prufrock"
I like ML, quantitative biology, and medicine.
slides for those interested: www.auai.org/uai2025/tuto...
July 22, 2025 at 2:12 AM
using interpretability methods to extract biological hypotheses hidden in the model and then validating the most promising ones in the wet lab. some datasets are strong enough to establish the existence of a mechanism, but observing the mechanism in action will ultimately be needed to settle debates
July 19, 2025 at 6:06 PM
where the authors examine the model gradients (cf. "Identifying important regions and regulators") to guess which TF motifs interact with genes. From Fig 1 of the paper, it's clear the model is inaccurate in many scenarios, and accurate in just as many. but they still found a new biological fact from it
July 19, 2025 at 6:01 PM
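a minimal sketch of the gradient-style attribution idea the post above gestures at (not the paper's actual method): score a sequence with a model, perturb each position, and flag the positions where the score moves most as candidate motifs. the motif, sequence, and scoring function here are all made up for illustration.

```python
# Toy in-silico mutagenesis: a finite-difference analogue of taking input
# gradients of a sequence model. score() is a stand-in for a trained model.

MOTIF = "TATA"  # hypothetical TF motif the toy "model" responds to

def score(seq: str) -> float:
    # stand-in model: counts occurrences of the motif
    return sum(1.0 for i in range(len(seq) - len(MOTIF) + 1)
               if seq[i:i + len(MOTIF)] == MOTIF)

def attribution(seq: str) -> list[float]:
    # for each position, the max score change over all single-base substitutions
    base_score = score(seq)
    attrs = []
    for i in range(len(seq)):
        deltas = [abs(score(seq[:i] + b + seq[i + 1:]) - base_score)
                  for b in "ACGT" if b != seq[i]]
        attrs.append(max(deltas))
    return attrs

seq = "GGCTATAAGG"
attrs = attribution(seq)
top = max(range(len(seq)), key=lambda i: attrs[i])
print(top, attrs[top])  # positions 3-6 (the TATA) get nonzero attribution
```

positions inside the planted motif light up; everything else stays at zero. real attribution maps are noisier, which is why the Fig 1 caveat above matters.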
(stops *at benchmarks) ... rarely applies interpretability to extract possible mechanisms (causal relationships), i.e. biological truths, that the model has learned and that hold in certain contexts. a good example of this interpretability is (www.nature.com/articles/s41...)
July 19, 2025 at 5:48 PM
"causal" has several technical definitions, eg referenced authors' vs your own, and considering only one to be a appropriate works for debating, but ignores other fields of study. however, this thread made me realize that ML culture stops and benchmarks, and rarely applies interpretability
July 19, 2025 at 5:45 PM
chopper meme <3
July 4, 2025 at 1:53 AM
sorry, thanks for correcting. causal methods do exist for deconfounding of observational data (eg propensity matching), but agree that interventions (eg crispr screens) are preferable. best is RNA + ATAC (www.nature.com/articles/s41...) though RNA alone can also be informative (www.nature.com/articles/s41...)
July 2, 2025 at 9:36 PM
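a toy sketch of the propensity matching mentioned above, on simulated data (all numbers invented): a confounder drives both treatment and outcome, so the naive difference in means is biased; matching treated units to controls with similar estimated propensity recovers something close to the true effect.

```python
import math, random

random.seed(0)

# Simulated observational data: confounder x raises both treatment
# probability and outcome. True treatment effect = 1.0.
n = 2000
data = []
for _ in range(n):
    x = random.gauss(0, 1)
    p = 1 / (1 + math.exp(-2 * x))               # true propensity P(T=1|x)
    t = 1 if random.random() < p else 0
    y = 3 * x + 1.0 * t + random.gauss(0, 0.1)
    data.append((x, t, y))

# 1) fit a logistic propensity model e(x) by plain gradient descent
w = b = 0.0
for _ in range(300):
    gw = gb = 0.0
    for x, t, _ in data:
        e = 1 / (1 + math.exp(-(w * x + b)))
        gw += (e - t) * x
        gb += e - t
    w -= 0.5 * gw / n
    b -= 0.5 * gb / n

def prop(x):
    return 1 / (1 + math.exp(-(w * x + b)))

treated = [(prop(x), y) for x, t, y in data if t == 1]
control = [(prop(x), y) for x, t, y in data if t == 0]

# 2) match each treated unit to the control with the nearest propensity
att = sum(y - min(control, key=lambda c: abs(c[0] - e))[1]
          for e, y in treated) / len(treated)

naive = (sum(y for _, t, y in data if t == 1) / len(treated)
         - sum(y for _, t, y in data if t == 0) / len(control))
print(round(naive, 2), round(att, 2))  # naive estimate is badly inflated
```

the matched estimate lands near the true effect of 1.0 while the naive contrast is several times too large; the point of the post stands, though: this only deconfounds what you measured, unlike an actual intervention.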
Reposted
I am once again reminded of my favorite @edyong209.bsky.social quote:

“…a tiny wise decision can do exponential good.”
June 28, 2025 at 4:09 PM
disagree that trans reg. models always end up just doing weighted nearest neighbors & that they don't learn biology, but agree current cis models generalize better. perturbation data is not sequence but it is still biology. for me a GRN models a "virtual cell" more directly than sequence does (my bias). 💯 thread
June 28, 2025 at 10:03 PM
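for concreteness, here is what the "weighted nearest neighbors" behaviour debated above would look like: predict a held-out perturbation's expression shift as a similarity-weighted average of training perturbations' shifts. the embeddings and expression values are entirely made up.

```python
import math

# Hypothetical data: perturbation embedding -> observed expression shift
# for 2 genes. A model that collapses to weighted-kNN just interpolates
# these training profiles instead of learning regulatory mechanism.
train = {
    (1.0, 0.0): [2.0, -1.0],
    (0.0, 1.0): [-1.0, 3.0],
    (1.0, 1.0): [1.0, 1.0],
}

def predict(query):
    # weights decay exponentially with embedding distance
    weights = {emb: math.exp(-math.dist(query, emb)) for emb in train}
    z = sum(weights.values())
    return [sum(w * train[emb][g] for emb, w in weights.items()) / z
            for g in range(2)]

# query close to the first training perturbation -> prediction is pulled
# toward that perturbation's profile [2.0, -1.0]
pred = predict((0.9, 0.1))
print([round(v, 2) for v in pred])
```

whether a trained trans model reduces to this is exactly the empirical question; the sketch just pins down the baseline being argued about.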
authors attribute it to the genetic perturbation datasets being smaller:
"performance gains over mean baselines were notably larger on Tahoe-100Million[...] and Parse-PBMC, which includes 10 million [...], as compared to [...] genetic perturbation datasets conducted in just a few cell lines" (about 3 million cells)
June 26, 2025 at 2:24 AM
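for anyone unfamiliar with the "mean baseline" being beaten here, a minimal sketch (all numbers invented): predict the average expression profile of the training perturbations for any held-out perturbation, ignoring the perturbation identity entirely. a model only adds value if it beats this.

```python
# Toy mean baseline for perturbation-response prediction.
train = {  # perturbation -> expression of 3 genes (made-up values)
    "KO_A": [1.0, 0.0, 2.0],
    "KO_B": [3.0, 1.0, 0.0],
    "KO_C": [2.0, 2.0, 1.0],
}
held_out = [2.0, 1.5, 0.5]  # true profile of an unseen perturbation

n_genes = 3
# per-gene mean over training perturbations, used as the prediction
mean_pred = [sum(v[g] for v in train.values()) / len(train)
             for g in range(n_genes)]

# mean squared error of the baseline on the held-out perturbation
mse = sum((p - t) ** 2 for p, t in zip(mean_pred, held_out)) / n_genes
print(mean_pred, round(mse, 3))
```

with few perturbations in few cell lines, per-gene means are already strong predictors, which is one plausible reading of the quoted result.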