Yana Bromberg
yanabromberg.bsky.social
Bioinformatics geek. Prof @Emory, CS and Biology. Learning protein-ish and deciphering the DNA blueprint of life
While our focus is on protein embeddings, the same intuition applies to any domain using latent representations, from medical imaging to physics.
We are looking forward to hearing how your models improve with RNS!
May 12, 2025 at 2:29 PM
Main findings:
• Variant effect prediction accuracy was ~60% for low-reliability embeddings vs ~90% for high-reliability ones.
• Hundreds to thousands of human proteins, per model, may be poorly captured.
• Our score flags low-reliability embeddings, guiding better model training and downstream fine-tuning.
May 12, 2025 at 2:29 PM
I’m super excited to share our work (with Prabakaran Ramakrishnan) on scoring embedding reliability. We propose the RNS (random neighbors) score, which improves the next steps in model use, e.g. variant effect prediction, structure modeling, and function annotation.
www.biorxiv.org/content/10.1...
Quantifying uncertainty in Protein Representations Across Models and Task
Embeddings, derived by language models, are widely used as numeric proxies for human language sentences and structured data. In the realm of biomolecules, embeddings serve as efficient sequence and/or...
www.biorxiv.org
May 12, 2025 at 2:29 PM
Love all of it! The big question is are you on this side of the pond yet?
December 24, 2024 at 8:48 AM