Jeremy Schwartzentruber
@jeremy37.bsky.social
Scientist at Illumina using AI methods to interpret the non-coding genome and empower new genetic association discoveries.
We trained scores for ~60 quantitative traits in 256k UKB individuals, and found that these had higher correlation with values in the 64k test set than models based on raw variant annotations did. Individuals with high FlexRV-PRS were also more enriched for outlier trait values.
November 5, 2025 at 10:29 PM
I visualized all of the variant score / MAF weights as a heatmap, counting the number of times each weight transformation had the lowest p-value. Interestingly, highly constrained genes (S_het > 0.05) more often benefit from placing weight on rarer, highly deleterious variants.
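The counting behind such a heatmap can be sketched as follows. This is only an illustration, not the FlexRV code: the function name `count_best_transformations`, the array layout (one row per gene-trait association, one column per score/MAF weight combination in row-major order), and the toy p-values are all assumptions.

```python
import numpy as np

def count_best_transformations(pvals, n_score, n_maf):
    """Count how often each (score transform, MAF transform) pair gives the lowest p-value.

    pvals: array of shape (n_associations, n_score * n_maf); column j is assumed
    to correspond to score transform j // n_maf and MAF transform j % n_maf.
    Returns an (n_score, n_maf) grid of counts, ready to plot as a heatmap.
    """
    pvals = np.asarray(pvals, dtype=float)
    best = np.argmin(pvals, axis=1)  # index of the winning combination per row
    counts = np.bincount(best, minlength=n_score * n_maf)
    return counts.reshape(n_score, n_maf)

# Toy input: 3 associations, 2 score transforms x 2 MAF transforms (made-up values).
toy = [[0.5, 0.01, 0.2, 0.3],
       [0.04, 0.5, 0.6, 0.7],
       [0.9, 0.001, 0.5, 0.4]]
grid = count_best_transformations(toy, n_score=2, n_maf=2)
```

Splitting the rows by gene constraint (for example S_het above or below 0.05) and building one grid per group would give the comparison described above.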
November 5, 2025 at 10:29 PM
Why is this approach so effective? I think it comes down to the fact that the “annotation → trait” mapping is often nonlinear and, importantly, differs for each gene. Here are a few examples.
November 5, 2025 at 10:29 PM
We also checked whether the FlexRV gene-based associations are enriched for proximity to GWAS hits (which are better powered but don’t give the causal gene directly) or have high PoPS scores (locus-independent GWAS signal), and found better enrichments than other methods.
November 5, 2025 at 10:29 PM
We used multiple approaches to check whether these associations are real. We ran FlexRV in 200k UKB individuals and checked whether the resulting associations replicated in the reported DeepRVAT results on the full cohort. They replicated at a higher rate than those from Regenie or STAAR.
November 5, 2025 at 10:29 PM
For example, DeepRVAT is a very cool deep learning method that combines many variant annotations together into a “gene impairment score” for each individual. But for the 28 quantitative traits we tested in UKB, FlexRV found 37% more associations (and 58% more for binary traits)!
November 5, 2025 at 10:29 PM
Each set of weights is a hypothesis about the possible relationship between variants and their effects.
Like STAAR, we combine p-values from these tests together with the Cauchy combination test (CCT). We were surprised by just how much this improved power over other approaches.
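For readers unfamiliar with the CCT, here is a minimal sketch of the standard formula. Each p-value is mapped to a Cauchy-distributed quantity, the weighted mean is taken, and the result is mapped back to a p-value. The function name, the equal default weights, and the small-p approximations are choices made for this illustration, not details taken from FlexRV or STAAR.

```python
import math

def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the Cauchy combination test.

    Statistic: T = sum_i w_i * tan((0.5 - p_i) * pi), with weights summing to 1.
    Combined p-value: 0.5 - atan(T) / pi.
    """
    pvalues = list(pvalues)
    if not pvalues:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(pvalues)  # equal weights (an assumption of this sketch)
    total_w = float(sum(weights))
    stat = 0.0
    for p, w in zip(pvalues, weights):
        if not 0.0 < p < 1.0:
            raise ValueError("p-values must lie strictly between 0 and 1")
        if p < 1e-15:
            term = 1.0 / (p * math.pi)  # tan((0.5 - p) * pi) ~ 1 / (p * pi) for tiny p
        else:
            term = math.tan((0.5 - p) * math.pi)
        stat += (w / total_w) * term
    if stat > 1e15:
        return 1.0 / (stat * math.pi)  # tail approximation for very large statistics
    return 0.5 - math.atan(stat) / math.pi

# Made-up p-values: one strong signal among several weak ones.
combined = cauchy_combination([1e-8, 0.5, 0.9])
```

The combined p-value is dominated by the smallest inputs, so a single well-matched weighting hypothesis can carry the gene even when the other tests show nothing.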
November 5, 2025 at 10:29 PM
Similar to STAAR, we ran many burden/SKAT tests with different weights. But in our case, we used a single high-performing variant effect predictor (as well as MAF), which we transformed to weights based on plausible relationships between score and biological effect.
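To make the idea of "transforming a score to weights" concrete, here is a hedged sketch of a weighted burden score under a few candidate transformations. The specific transformations (`linear`, `power`, `threshold`), their parameters, and all function names are hypothetical and chosen for illustration; the thread does not say which transformations FlexRV uses. The Beta(1, 25)-shaped MAF weight is a common default in SKAT-style tests and is likewise assumed here.

```python
import numpy as np

def score_to_weights(score, kind, threshold=0.8, power=2.0):
    """Map variant effect scores in [0, 1] to non-negative weights (hypothetical transforms)."""
    score = np.asarray(score, dtype=float)
    if kind == "linear":
        return score                                  # effect grows in proportion to score
    if kind == "power":
        return score ** power                         # high scores matter disproportionately
    if kind == "threshold":
        return (score >= threshold).astype(float)     # only high-scoring variants count
    raise ValueError(f"unknown transformation: {kind}")

def maf_to_weights(maf, a=1.0, b=25.0):
    """Unnormalised Beta(a, b)-shaped weight; with a=1, b=25 rarer variants get more weight."""
    maf = np.asarray(maf, dtype=float)
    return maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)

def burden_scores(genotypes, score, maf, kind):
    """Weighted burden per individual: (n_individuals x n_variants) genotypes times weights."""
    weights = score_to_weights(score, kind) * maf_to_weights(maf)
    return np.asarray(genotypes, dtype=float) @ weights

# Toy data: 3 individuals, 3 variants in one gene (made-up values).
genotypes = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
score = [0.9, 0.5, 0.1]
maf = [0.001, 0.01, 0.001]
burdens = {kind: burden_scores(genotypes, score, maf, kind)
           for kind in ("linear", "power", "threshold")}
```

Each burden vector would then be tested against the trait on its own, so every transformation yields one p-value per gene.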
November 5, 2025 at 10:29 PM
We thought that the relationship between variant effect predictions and traits might be nonlinear. Indeed, just visualizing PrimateAI3D scores vs. various measurements in UK Biobank shows you it can be nonlinear. The same is true for AlphaMissense and other scores.
November 5, 2025 at 10:29 PM