Leo Zang
@leozang.bsky.social
Protein Designer | Share Reading Notes (AI+Protein/RNA/DNA)
www.leozang.com
Pinned
Collection of Papers and Posts
www.leozang.com/Paper-Collec...
Computational protein design
- "This Primer provides an introduction to the main approaches in computational protein design, covering both physics-based and machine-learning-based tools. It aims to be accessible to biological, physical and computer scientists alike."
www.nature.com/articles/s43...
March 5, 2025 at 9:14 PM
Protein-Based Degraders: From Chemical Biology Tools to Neo-Therapeutics
- "we provide a comprehensive and critical review of studies that have used proteins and peptides to mediate the degradation and hence the functional control of otherwise challenging disease-relevant protein targets."
January 30, 2025 at 5:33 PM
Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review
arxiv.org/abs/2501.09685
January 23, 2025 at 3:41 AM
Targeting protein–ligand neosurfaces with a generalizable deep learning tool | @Nature
- MaSIF-neosurf can design binders for protein-ligand complexes, targeting neosurfaces (i.e., ligand-induced structural changes on the protein surface)
Link: www.nature.com/articles/s41...
January 18, 2025 at 6:53 AM
Massively parallel characterization of transcriptional regulatory elements
- Develop an optimized lentiMPRA (lentiviral massively parallel reporter assay) method to test the regulatory activity of >680,000 sequences across three cell types (HepG2, K562, WTC11)
Link: www.nature.com/articles/s41...
January 17, 2025 at 7:12 AM
DNALONGBENCH: A Benchmark Suite for Long-Range DNA Prediction Tasks
www.biorxiv.org/content/10.1...
Engineering of CRISPR-Cas PAM recognition using deep learning of vast evolutionary data
www.biorxiv.org/content/10.1...
January 9, 2025 at 9:58 PM
A review of deep learning models for the prediction of chromatin interactions with DNA and epigenomic profiles | @BriefingBioinfo
Link: academic.oup.com/bib/article/...
December 27, 2024 at 5:05 AM
Leveraging ancestral sequence reconstruction for protein representation learning
www.nature.com/articles/s42...
Guiding Generative Protein Language Models with Reinforcement Learning
arxiv.org/abs/2412.12979
December 19, 2024 at 1:42 AM
Harnessing the biology of regulatory T cells to treat disease
- "This Review will discuss recent advances in our understanding of human Treg cell biology, with a focus on mechanisms of action and strategies to assess outcomes of Treg cell-targeted therapies."
www.nature.com/articles/s41...
December 16, 2024 at 7:32 PM
Annotation-guided Protein Design with Multi-Level Domain Alignment
arxiv.org/abs/2404.16866
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
arxiv.org/abs/2406.10391
December 16, 2024 at 3:04 AM
mRNA m6A detection | @MethodsPrimers
- "This Primer outlines the available tools for detecting and mapping m6A, discusses the strengths and limitations of each method and offers guidance on selecting the most suitable approach."
www.nature.com/articles/s43...
December 15, 2024 at 7:54 PM
Concept Bottleneck Language Models For protein design
- Introduce CB-pLM (Concept Bottleneck Protein Language Models), ranging from 24M to 3B parameters, trained on UniRef50 and SwissProt over 718 concepts (including Cluster name, Biological process, and Biopython-derived features)
arxiv.org/abs/2411.06090
December 14, 2024 at 10:29 PM
Benchmarking recent computational tools for DNA-binding protein identification
- "we conduct an unbiased benchmarking of 11 state-of-the-art computational tools as well as traditional tools such as ScanProsite, BLAST, and HMMER for identifying DBPs."
Link: academic.oup.com/bib/article/...
December 12, 2024 at 4:14 AM
Comprehensive prediction and analysis of human protein essentiality based on a pretrained large language model | @NatComputSci
- PIC (Protein Importance Calculator), an ESM2-based deep learning model, predicts protein essentiality across three biological levels
Link: www.nature.com/articles/s43...
November 27, 2024 at 10:54 PM
Using artificial intelligence to document the hidden RNA virosphere
- PRIME, a protein language model (same architecture as ESM-2 650M), pretrained on 96 million sequences with optimal growth temperatures (OGTs annotated by [1]) using MLM, MSE, and correlation losses
Link: www.science.org/doi/10.1126/...
November 27, 2024 at 10:30 PM
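A minimal sketch of how the MSE and correlation terms mentioned in the PRIME note could be combined for OGT regression. The post does not give the exact formulation, so everything here is an assumption: the correlation loss is taken to be 1 minus the Pearson correlation, the MLM term is passed in as a precomputed number, and the function names and weights (`w_mse`, `w_corr`) are hypothetical.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between predicted and annotated OGTs."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pearson_r(pred, target):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    denom = math.sqrt(var_p * var_t)
    if denom == 0.0:
        return 0.0
    return cov / denom


def correlation_loss(pred, target):
    """Assumed form: 1 - Pearson r, so perfectly correlated predictions give 0."""
    return 1.0 - pearson_r(pred, target)


def combined_loss(pred, target, mlm_loss=0.0, w_mse=1.0, w_corr=1.0):
    """Weighted sum of the three terms; the weights are illustrative only."""
    return (
        mlm_loss
        + w_mse * mse_loss(pred, target)
        + w_corr * correlation_loss(pred, target)
    )
```

The correlation term rewards getting the ranking of temperatures right even when absolute values are off, which is one plausible reason to pair it with MSE.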
Getting aligned on representational alignment
- "In this Perspective, we survey the exciting recent developments in representational alignment research in the fields of cognitive science, neuroscience, and machine learning"
Link: arxiv.org/abs/2310.13018
November 27, 2024 at 9:49 PM
More:
- BindingDB in 2024: a FAIR knowledgebase of protein-small molecule binding data
academic.oup.com/nar/advance-...
- BFVD—a large repository of predicted viral protein structures
academic.oup.com/nar/advance-...
November 25, 2024 at 5:29 AM
Reposted by Leo Zang
Our Big Fantastic Virus Database (BFVD) is now published in NAR! It contains protein structure predictions of major viral clades, enhanced by petabase-scale homology search, and it's explorable on the web.
🌐 bfvd.foldseek.com
💾 bfvd.steineggerlab.workers.dev
📄 academic.oup.com/nar/advance-...
November 23, 2024 at 9:12 PM
Discovery and significance of protein-protein interactions in health and disease | @cellpressnews.bsky.social Review
Link: www.cell.com/cell/fulltex...
November 21, 2024 at 5:03 AM
Database updates
- The Pfam protein families database: embracing AI/ML
academic.oup.com/nar/advance-...
- UniProt: the Universal Protein Knowledgebase in 2025
academic.oup.com/nar/advance-...
- RASP v2.0: an updated atlas for RNA structure probing data
academic.oup.com/nar/advance-...
November 21, 2024 at 4:59 AM
InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
www.biorxiv.org/content/10.1...
- Use sparse autoencoders (SAEs) to extract and analyze interpretable features from ESM-2-8M
November 19, 2024 at 1:44 AM
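A minimal sketch of the general sparse autoencoder idea referenced in the InterPLM note: an overcomplete ReLU encoder, a linear decoder, and a loss that trades reconstruction error against an L1 sparsity penalty. This is a generic textbook-style SAE, not InterPLM's actual implementation; the function names, the `l1_coeff` value, and the toy dimensions are assumptions, and in practice the input `x` would be a per-residue embedding from the language model.

```python
def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(v):
    """Elementwise ReLU; keeps feature activations non-negative."""
    return [max(0.0, a) for a in v]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode x into (ideally sparse) features, then reconstruct x from them."""
    pre_act = [a + b for a, b in zip(matvec(w_enc, x), b_enc)]
    features = relu(pre_act)
    recon = [a + b for a, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff=1e-3):
    """Mean squared reconstruction error plus an L1 penalty on the features."""
    mse = sum((r - xi) ** 2 for r, xi in zip(recon, x)) / len(x)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coeff * l1
```

Each row of the decoder matrix's transpose can then be read as one candidate "feature direction", and the inputs that activate a feature most strongly are what one would inspect for interpretability.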
miRBench: A Comprehensive microRNA Binding Site Prediction Training and Benchmarking Dataset
Preprint: www.biorxiv.org/content/10.1...
GitHub: github.com/katarinagres...
November 16, 2024 at 9:16 PM
AlphaBind, a Domain-Specific Model to Predict and Optimize Antibody-Antigen Binding Affinity
- Encode the antibody and antigen with ESM2-nv (ESM2 but on NVIDIA), concatenate the embeddings, and feed them into a lightweight transformer (4 attention heads, 7 layers) to predict binding affinity
November 16, 2024 at 9:01 PM
The word limit here makes it difficult to post a summary thread 🤔
November 16, 2024 at 9:00 PM