John Doench
johndoench.bsky.social
Functional genomics @ Broad Institute. Screen all the things!
We conclude with a comparison of analysis methods, including a cautionary tale of false positives arising from the use of MAGeCK RRA.

We provide suggestions to "mitigate the reporting of false discoveries that, if left unchecked, would undermine confidence in large-scale perturbational screens."
September 7, 2025 at 5:02 PM
We next screened this library and assessed performance via separation of essential and non-essential genes - we see much greater depletion of the former while the distribution of the latter barely budges. So we accomplished our goal: smarter picking of guides increased on-target activity.
September 7, 2025 at 5:02 PM
Essentially, by worrying less about off-target activity (because we actually do a pretty good job of predicting it) we're able to design a library with a *much* higher distribution of on-target scores, as assessed by Rule Set 3.
September 7, 2025 at 5:02 PM
A lot of details I won't cover here, but essentially we filter out all guides with a CRISPick Aggregate CFD score > 4.8 and then select the 3 guides with the best Rule Set 3 score. This new picking scheme leads to a substantial difference between Jacquere and our prior Brunello+Gattinara libraries
September 7, 2025 at 5:02 PM
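The picking scheme described above (drop guides over the aggregate CFD cutoff, then keep the top 3 by Rule Set 3) can be sketched roughly as follows. This is only an illustration, not the CRISPick implementation: the `Guide` record, its field names, and `pick_guides` are hypothetical, it assumes a higher Rule Set 3 score means better predicted on-target activity, and it ignores the "lot of details" the post says it is skipping.

```python
from dataclasses import dataclass

AGGREGATE_CFD_MAX = 4.8   # cutoff quoted in the post
GUIDES_PER_GENE = 3       # number of guides kept per gene, per the post


@dataclass(frozen=True)
class Guide:
    # Hypothetical record; field names are assumptions for this sketch.
    sequence: str
    gene: str
    aggregate_cfd: float  # summed CFD over predicted off-target sites
    rs3_score: float      # Rule Set 3 score, assumed higher = more active


def pick_guides(guides, max_cfd=AGGREGATE_CFD_MAX, n=GUIDES_PER_GENE):
    """Filter out guides with aggregate CFD > max_cfd, then keep the n best by RS3 per gene."""
    by_gene = {}
    for guide in guides:
        if guide.aggregate_cfd > max_cfd:
            continue  # predicted to be too promiscuous
        by_gene.setdefault(guide.gene, []).append(guide)
    return {
        gene: sorted(candidates, key=lambda g: g.rs3_score, reverse=True)[:n]
        for gene, candidates in by_gene.items()
    }
```

A gene whose candidate guides all exceed the cutoff simply does not appear in the result here; how a real library design handles that case is not covered in the post.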
We then examined three sources of "what are all the genes" and designed Jacquere to target all of them.
September 7, 2025 at 5:02 PM
We then used the F1 score to determine the sum of CFD scores that best distinguished truly promiscuous guides from false-positive off-target guides.

We found the best performance if we considered only 1 mismatch in the SDR, at a CFD sum of 4.8, so we define that as our "CRISPick Aggregate CFD" threshold.
September 7, 2025 at 5:02 PM
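For readers unfamiliar with choosing a cutoff by F1 score, here is a minimal sketch of the general idea: call a guide "promiscuous" when its summed CFD exceeds a candidate threshold, compare against the observed labels, and keep the threshold with the highest F1. The function names and the toy numbers in the usage note are made up for illustration and are not the data or code from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(cfd_sums, is_promiscuous, candidates):
    """Return (threshold, f1) for the candidate cutoff that maximizes F1.

    A guide is predicted promiscuous when its summed CFD is > threshold.
    Ties keep the first (earliest) candidate.
    """
    best = None
    for threshold in candidates:
        tp = fp = fn = 0
        for cfd_sum, label in zip(cfd_sums, is_promiscuous):
            predicted = cfd_sum > threshold
            if predicted and label:
                tp += 1
            elif predicted and not label:
                fp += 1
            elif label:
                fn += 1
        score = f1_score(tp, fp, fn)
        if best is None or score > best[1]:
            best = (threshold, score)
    return best
```

With toy inputs such as `best_threshold([0.5, 1.0, 3.0, 6.0, 9.0], [False, False, False, True, True], [0.0, 2.0, 4.8, 8.0])`, the scan returns the candidate that cleanly separates the two groups.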
We then asked how well a simple aggregation of CFD scores (as originally proposed in the development of GuideScan, PMID: 28263296) could predict the "promiscuity" of guides, defined as "guides that shouldn't deplete in a viability screen but do anyway" using tiling data of non-essential genes
September 7, 2025 at 5:02 PM
Using the Cutting Frequency Determination (CFD) matrix for predicting off-target sites (PMID: 26780180), when we limit to 2 or fewer mismatches in the SDR, the predictions match the measured GuideSeq off-target sites remarkably well (here, each dot represents a bin of 37 off-target sites)
September 7, 2025 at 5:02 PM
Key detail: we're not using the entire guide in this search, we're using the Specificity Defining Region (SDR) which gives a "free pass" to any mismatches in the first 3 nucleotides of the guide, as that rarely contributes to specificity. This speeds up the genome-wide scan substantially.
September 7, 2025 at 5:02 PM
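To make the SDR idea concrete, here is a small sketch of counting mismatches while giving the first 3 nucleotides a free pass. It assumes both sequences are written 5' to 3' and have equal length, so the "first 3 nucleotides" are string indices 0 to 2; the function name is hypothetical and this is not the actual genome-wide search code.

```python
SDR_SKIP = 3  # leading guide nucleotides ignored, per the post


def sdr_mismatches(guide, site, skip=SDR_SKIP):
    """Count mismatches between guide and candidate site, ignoring the first `skip` positions."""
    if len(guide) != len(site):
        raise ValueError("guide and site must be the same length")
    return sum(a != b for a, b in zip(guide[skip:], site[skip:]))
```

For a 20-nt guide, a site that differs at positions 1, 2, 3, and 20 has 4 mismatches overall but only 1 in the SDR, so it would still be considered under a "1 mismatch in the SDR" rule.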
We see that, unsurprisingly, the more mismatches one allows to the guide sequence, the more potential off-target sites one computationally identifies, but the fraction that are actually active in the GuideSeq data plunges dramatically.
September 7, 2025 at 5:02 PM
I searched "Kraft Mayor" and this is what came up.

Good to know a search engine thinks a jar of mayonnaise is a better fit for mayor of my city than some nepo baby.
June 3, 2025 at 11:26 PM
It has come to my attention that we do not have a CRISPR library that targets the Marshmallow Fluff genome. Getting to work on that asap!
May 14, 2025 at 4:43 PM
I cannot be more excited 😊
May 10, 2025 at 11:08 PM
I guess I owe George Lucas an apology
April 9, 2025 at 11:05 AM
March 19, 2025 at 3:49 PM
@politicalwire.com Wow, just look at the current 'most popular' headlines... 9 of 10 start with 'Trump' ... and the other ends with 'Trump' ... certainly proves the point of Trump is Making Himself Inescapable!
February 10, 2025 at 11:06 PM
Second, looking at off-target effects of CRISPRi: "Use of multiple guides per gene when performing a CRISPRi/a screen should continue to be a standard practice to reduce false positives and to increase confidence in true hits..."
January 6, 2025 at 7:18 PM
Two recent-ish papers provide good reminders of timeless lessons. 1) From @vidigaljoana.bsky.social: "the process of clonal expansion alone resulted in levels of clonal variation extensive enough to essentially preclude us from determining the functional impact of [miRNA] binding site disruption"
January 6, 2025 at 7:18 PM
What is 54 million light-years away from us? According to the Hubble, this funky thing known as the spiral galaxy NGC 4689, in the constellation Coma Berenices. This is "relatively nearby for a galaxy" apparently.
esahubble.org/images/potw2...
December 8, 2024 at 9:52 PM
For context, our galaxy, the Milky Way, is "only" 100,000 light-years wide. So human DNA could zip back and forth across the Milky Way hundreds of times.
December 8, 2024 at 9:52 PM
And how many cells does a human have? This number is more of a range across people than a specific answer, but about 30 trillion (3x10^13) is the general consensus (source: Wikipedia, and confirmed via the primary literature; cartoon from XKCD, where else?) imgs.xkcd.com/comics/cell_...
December 8, 2024 at 9:52 PM
Next, how many basepairs do we have in a human cell? While the number has been refined over time, the most recent T2T (telomere-to-telomere) genome assembly puts the number just over 3 billion (3.055 x 10^9). Of course, we are diploid, so multiply that number by 2 for a per-cell content.
December 8, 2024 at 9:52 PM
A little fun with numbers on this Sunday, showing just how successful DNA really is. First, here's what DNA looks like. I'll draw your attention to the "rise" in DNA, that is, the distance between two basepairs, which is 3.4 Angstroms, also known as 3.4x10^-10 meters...
December 8, 2024 at 9:52 PM
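The arithmetic in this thread can be reproduced with a few lines. The rise, basepair count, cell count, and Milky Way width come from the posts; the world population (about 8.2 billion) and the meters-per-light-year conversion are assumptions added here, since the thread as captured does not state them. With those inputs the total lands near the 54 million light-years quoted.

```python
RISE_M = 3.4e-10              # distance between adjacent basepairs, in meters (from the thread)
HAPLOID_BP = 3.055e9          # basepairs in the T2T assembly (from the thread)
CELLS_PER_HUMAN = 3e13        # roughly 30 trillion cells (from the thread)
WORLD_POPULATION = 8.2e9      # assumption: not stated in the thread
METERS_PER_LIGHT_YEAR = 9.4607e15  # assumption: standard conversion factor
MILKY_WAY_WIDTH_LY = 1e5      # "only" 100,000 light-years wide (from the thread)

# Diploid genome: two copies per cell.
dna_per_cell_m = RISE_M * 2 * HAPLOID_BP
dna_per_person_m = dna_per_cell_m * CELLS_PER_HUMAN
dna_all_humans_ly = dna_per_person_m * WORLD_POPULATION / METERS_PER_LIGHT_YEAR
milky_way_crossings = dna_all_humans_ly / MILKY_WAY_WIDTH_LY

print(f"DNA per cell: {dna_per_cell_m:.2f} m")
print(f"DNA per person: {dna_per_person_m:.2e} m")
print(f"DNA across all humans: {dna_all_humans_ly:.2e} light-years")
print(f"Milky Way crossings: {milky_way_crossings:.0f}")
```

That works out to about 2 meters of DNA per cell and several hundred Milky Way widths in total, consistent with "hundreds of times" above.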
Still gets placed high up on the ol’ tree!
December 7, 2024 at 12:47 AM
It is not a "quick question" simply because it was quick for you to ask it. It must also be quick for the receiver to answer it. This is very much an under-appreciated asymmetry.

And no, I did not actually hit send, as much as I wanted to.
December 2, 2024 at 2:14 PM