Jacob Schreiber
jmschreiber91.bsky.social
Studying genomics, machine learning, and fruit. My code is like our genomes -- most of it is junk.

Assistant Professor UMass Chan, Board of Directors NumFOCUS
Previously IMP Vienna, Stanford Genetics, UW CSE.
Why is it a problem to translate C++ to changing standards when you can just use Agentic AI?
October 31, 2025 at 7:01 PM
they told the flight attendants to sit down for the second half of the flight (an hour) because it was entirely turbulence
October 13, 2025 at 9:43 PM
I figure I can spend the time until I get tenure on answering the question, and then the time after tenure arguing about what "regulatory" means
October 7, 2025 at 3:37 PM
If you're interested, please reach out with your CV and which topics you'd be interested in working on!
October 7, 2025 at 3:28 PM
- Genomics Software Ecosystem: A major obstacle to our goal is the lack of simple+scalable software that everyone can use. Come build this with me. Training a lightweight deep learning model and using it for design/interpretability/VE prediction should be no more challenging than mapping reads.
October 7, 2025 at 3:28 PM
- Foundation Models: As someone involved in ML, I am legally required to be working on this topic.
October 7, 2025 at 3:26 PM
We have an array of ML-based projects for going after this, focusing on the following topics:

- DNA Design ( 🧬 ) We have shown that Ledidi (www.biorxiv.org/content/10.1...) can precisely design DNA, and now it's time to push the boundaries in several directions w/ some very cool collaborations.
Programmatic design and editing of cis-regulatory elements
The development of modern genome editing tools has enabled researchers to make such edits with high precision but has left unsolved the problem of designing these edits. As a solution, we propose Ledi...
www.biorxiv.org
October 7, 2025 at 3:26 PM
It was suggested that the audience may not appreciate/understand :(
October 3, 2025 at 6:22 AM
Thanks! Let me know if you want me to stop in virtually, we can try to figure out a time.
August 27, 2025 at 5:15 PM
Hope you find tangermeme helpful in your work! Please reach out if you have any comments + questions.
August 27, 2025 at 4:44 PM
Because everything is automatic, we can probe models.

What motifs are driving model predictions? Calculate attributions, call + annotate seqlets, and count the annotations!

BPNet is relying on MYC, whereas Beluga is relying on many more TFs. Easy comparison now.
August 27, 2025 at 4:43 PM
Frequently, people manually annotate seqlets and draw bars or boxes around these high-attribution characters themselves. This is not really a problem, but it's just slow and does not scale genome-wide.

In the above picture, everything is done automatically.
August 27, 2025 at 4:41 PM
People *talk* about seqlets a lot, but tangermeme is the first package to offer complete functionality for them.

Here is a complete example of using tangermeme for attributions, seqlet calling + annotation, and plotting, to visualize what five models think of the same locus
August 27, 2025 at 4:40 PM