Sarah Gurev
sarahgurev.bsky.social
Postdoc @ Debbie Marks Lab, Harvard | Prev. PhD @ MIT EECS || ML for Proteins + Viruses 🦠
EVEREST highlights:
✅ Where models fail—and why
✅ Which viruses are least/most predictable
✅ How to estimate per-protein, model-specific reliability
✅ Concrete steps to improve ML for viral mutation prediction
9/12
August 17, 2025 at 3:42 AM
🌍Current models fail to reliably predict mutations in more than half of the high-priority viruses identified by the WHO.
8/12
August 17, 2025 at 3:42 AM
💪Is bigger always better? Maybe not for other taxa, but for viruses, yes! Here, models continue to improve as the number of parameters grows.
7/12
August 17, 2025 at 3:42 AM
🤏Why? Viruses are severely underrepresented in training datasets (<1%) and are downsampled even further by common clustering approaches.
6/12
August 17, 2025 at 3:42 AM
📉Despite the hype, protein language models trained across the “protein universe” are outperformed by even the simplest, site-independent alignment-based model.
5/12
August 17, 2025 at 3:42 AM
🚨New paper 🚨

Can protein language models help us fight viral outbreaks? Not yet. Here’s why 🧵👇
1/12
August 17, 2025 at 3:42 AM