Gherman Novakovsky
gnovakovsky.bsky.social
PhD, Illumina AI lab
Yes, that's exactly what it is. Predicting the difference here is important.
June 10, 2025 at 4:40 PM
This ensures the model focuses on the actual variant and doesn't overfit to correlated but irrelevant features, which leads to better generalization.
June 10, 2025 at 6:47 AM
Certainly! Here the entire model is shared, two copies see inputs that differ only at a single base pair (a variant of interest), and the model weights are tuned to learn the difference in effect size correctly.
June 10, 2025 at 6:47 AM
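The weight-sharing described in this reply can be sketched in a few lines. This is a minimal numpy illustration, not the actual PromoterAI architecture: the "model" here is a hypothetical per-base linear scorer, and the point is only that the same weights score both the reference and alternate sequence, so the training signal is the difference between the two outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared "model": one weight per base (A, C, G, T).
# In a twin-network setup the SAME weights score both inputs.
W = rng.normal(size=(4,))

def one_hot(seq):
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    return np.eye(4)[[idx[b] for b in seq]]

def score(seq, W):
    # Shared network: identical weights for both twins.
    return (one_hot(seq) @ W).sum()

def predicted_effect(ref, alt, W):
    # Twin pass: same model, inputs differing at a single base;
    # the training target is the measured effect size of the variant.
    return score(alt, W) - score(ref, W)

ref = "ACGTACGT"
alt = "ACGAACGT"  # single-base substitution (T -> A)
effect = predicted_effect(ref, alt, W)
```

Because the inputs differ at only one position, everything else cancels and the predicted effect depends solely on the changed base, which is what keeps the model focused on the variant itself.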
Great question! That's our best guess as well and we highlight this in the paper by saying that MPRA experimental data from individual cell lines could have limitations for variant interpretation.
June 10, 2025 at 6:46 AM
Huge thanks to the amazing Illumina team—this was an incredible learning experience! I'm excited to keep pushing forward as we develop models to tackle gene expression and non-coding variant interpretation. (16/)
May 29, 2025 at 11:57 PM
A complementary thread from my colleague Kishore Jaganathan @kjaganatha.bsky.social bsky.app/profile/kjag... (15/)
We're thrilled to introduce PromoterAI — a tool for accurately identifying promoter variants that impact gene expression. 🧵 (1/)
May 29, 2025 at 11:57 PM
Want to learn more about PromoterAI?
📄 Read the paper: science.org/doi/10.1126/...
💻 Explore the code & precomputed scores: github.com/Illumina/Pro.... (14/)
Predicting expression-altering promoter mutations with deep learning
Only a minority of patients with rare genetic diseases are currently diagnosed by exome sequencing, suggesting that additional unrecognized pathogenic variants may reside in non-coding sequence. Here,...
science.org
May 29, 2025 at 11:57 PM
We followed up by testing promoter variants in Mendelian genes using MPRA. Surprisingly, PromoterAI was more effective than MPRA at prioritizing variants linked to patient phenotypes, highlighting limitations of MPRA for rare disease interpretation. (13/)
May 29, 2025 at 11:57 PM
While we found that adding data from additional species such as mouse does not substantially improve variant effect prediction on its own, it does help with ensembling. The final model is therefore an ensemble of two models: one trained on human data only and one trained on human and mouse data together. (12/)
May 29, 2025 at 11:57 PM
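The ensemble in (12/) can be sketched as follows. This is a minimal illustration assuming a simple average of the two members' variant scores; the stand-in "models" below just return fixed arrays, and the actual combination rule used in the paper may differ.

```python
import numpy as np

# Hypothetical stand-ins for the two trained ensemble members.
def human_only_model(variants):
    return np.array([0.8, -0.3, 0.1])

def human_mouse_model(variants):
    return np.array([0.6, -0.5, 0.2])

def ensemble_score(variants):
    # Final prediction: average of the human-only and human+mouse models.
    return (human_only_model(variants) + human_mouse_model(variants)) / 2

scores = ensemble_score(["v1", "v2", "v3"])
```

Averaging is the simplest way two differently trained models can complement each other: errors that are uncorrelated between the members partially cancel.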
In the Genomics England rare disease cohort, functional promoter variants predicted by PromoterAI were enriched in phenotype-matched Mendelian genes. These variants accounted for an estimated 6% of the rare disease genetic burden. (11/)
May 29, 2025 at 11:57 PM
In the UK biobank cohort, PromoterAI's predicted promoter variant effects correlated strongly with measured protein levels and quantitative traits, suggesting that promoter variants contribute meaningfully to phenotypic variation in the general population. (10/)
May 29, 2025 at 11:57 PM
PromoterAI's embeddings split promoters into three distinct classes: P1 (~9K genes, ubiquitously active), P2 (~3K genes, bivalent chromatin), E (~6K genes, enhancer-like). The E class, enriched for TATA boxes, may reflect enhancers co-opted as promoters. (9/)
May 29, 2025 at 11:57 PM
Fine-tuning improved PromoterAI’s ability to predict the direction of motif effects — a known issue of multitask models. The model often recognized motifs before fine-tuning, but got the direction wrong. After fine-tuning, its predictions aligned better with the data. (8/)
May 29, 2025 at 11:57 PM
We used our list of gene expression outliers to explore their effect on transcription factor binding sites. Our results show that it is easier for new variants to cause outlier gene expression by disrupting existing regulatory components rather than creating new ones. (7/)
May 29, 2025 at 11:57 PM
We also attempted to fine-tune Enformer and Borzoi on our promoter variant set. While their performance improved, both models still lagged behind PromoterAI. Notably, even before fine-tuning, PromoterAI outperformed Enformer and was on par with Borzoi. (6/)
May 29, 2025 at 11:57 PM
When it comes to predicting expression effects of promoter variants, PromoterAI achieved the best performance across benchmarks spanning RNA, protein, QTL, and MPRA data. (5/)
May 29, 2025 at 11:57 PM
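Benchmarks like those in (5/) typically score a model by the rank correlation between predicted and measured variant effect sizes. Here is a minimal numpy sketch of that evaluation; the data are made up, the function is a bare-bones Spearman correlation without tie handling, and none of it comes from the paper's actual pipeline.

```python
import numpy as np

def spearman(a, b):
    # Rank correlation between two score vectors (no tie handling).
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return (ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb))

# Hypothetical predicted vs. measured variant effect sizes.
predicted = np.array([0.9, -0.2, 0.4, -0.7])
measured = np.array([1.1, -0.1, 0.3, -0.9])
rho = spearman(predicted, measured)
```

Rank correlation is a natural choice across heterogeneous benchmarks (RNA, protein, QTL, MPRA) because it ignores differences in the scale of each assay's effect sizes.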
The second step was to fine-tune the model using a carefully curated list of rare promoter variants linked to aberrant gene expression. The fine-tuning was done in a twin-network setup to ensure generalization across unseen genes and datasets. (4/)
May 29, 2025 at 11:57 PM
First, we pre-trained PromoterAI to predict histone marks, TF binding, DNA accessibility, and CAGE signal from genomic sequence. The key difference from models like Enformer and Borzoi is that we predict at single base-pair resolution and use only TSS-centered regions. (3/)
May 29, 2025 at 11:57 PM
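The pre-training I/O in (3/) can be sketched by its shapes alone. This is a toy illustration with made-up sizes and a linear stand-in for the network; the real model uses a much longer TSS-centered window and many more assay tracks.

```python
import numpy as np

L = 16        # window length centered on a TSS (illustrative; real one is longer)
n_tracks = 5  # e.g. histone marks, TF binding, accessibility, CAGE

rng = np.random.default_rng(1)
x = np.eye(4)[rng.integers(0, 4, size=L)]  # fake one-hot DNA input, shape (L, 4)
W = rng.normal(size=(4, n_tracks))

# Linear stand-in for the network: one prediction per base, per assay track,
# i.e. single base-pair resolution rather than binned output.
y = x @ W  # shape (L, n_tracks)
```

The contrast with Enformer/Borzoi is in the output shape: rather than averaging signal into bins, every base in the TSS window gets its own multitask prediction.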
PromoterAI is built from transformer-inspired blocks called MetaFormers, but instead of attention we use depthwise convolutions, making it a fully convolutional model. We believe CNN-based methods have not yet been surpassed and remain a great choice for genomics tasks. (2/)
May 29, 2025 at 11:57 PM
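The block structure in (2/) can be sketched as follows: a MetaFormer keeps the transformer's residual skeleton (token mixer followed by a channel MLP) but swaps attention for another mixer, here a depthwise convolution. This numpy sketch is illustrative only; it omits normalization layers and uses arbitrary sizes, and it is not the actual PromoterAI block.

```python
import numpy as np

def depthwise_conv1d(x, kernels):
    # x: (L, C); kernels: (K, C) -- one 1-D filter per channel, 'same' padding.
    L, C = x.shape
    K = kernels.shape[0]
    pad = K // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for i in range(L):
        # Each channel is convolved with its own kernel (no cross-channel mixing).
        out[i] = np.einsum("kc,kc->c", xp[i:i + K], kernels)
    return out

def metaformer_block(x, kernels, W1, W2):
    # MetaFormer skeleton: token mixer + channel MLP, each with a residual.
    # The token mixer is a depthwise convolution instead of attention.
    x = x + depthwise_conv1d(x, kernels)  # mix information across positions
    h = np.maximum(x @ W1, 0.0)           # channel MLP with ReLU
    return x + h @ W2

rng = np.random.default_rng(2)
L, C, K, H = 8, 6, 3, 12
y = metaformer_block(
    rng.normal(size=(L, C)),
    rng.normal(size=(K, C)),
    rng.normal(size=(C, H)),
    rng.normal(size=(H, C)),
)
```

The depthwise convolution mixes information along the sequence axis only, while the MLP mixes channels, which is the same division of labor attention and MLP play in a transformer, and it keeps the whole model convolutional.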