suwenbin.bsky.social
@suwenbin.bsky.social
Reposted by suwenbin.bsky.social
New preprint with @mcagiada.bsky.social & @sokrypton.org in which we present a benchmark and predictions of absolute protein stability (ΔG not ΔΔG) using likelihoods from a generative model, and also benchmark it for conformational free energies against NMR 🧬 🧶

doi.org/10.1101/2024...
March 16, 2024 at 10:21 AM
Reposted by suwenbin.bsky.social
PLMs trained on sequence/structural data that is highly local (e.g., for a single protein family) outperform those trained on global datasets (e.g., all protein sequences)

doi.org/10.1101/2024.03.15.585128
March 17, 2024 at 9:30 PM
Reposted by suwenbin.bsky.social
Predicting absolute protein folding stability using generative models https://www.biorxiv.org/content/10.1101/2024.03.14.584940v1
While there has been substantial progress in our ability to predict changes in protein stability due
www.biorxiv.org
March 16, 2024 at 12:48 AM
Reposted by suwenbin.bsky.social
The more worrisome outcome is that tech companies endanger patients by strong-arming their AI-designed de novo proteins into clinical trials without appropriate in vivo data, akin to what happened with self-driving cars/social media algorithms/a million other things

www.nature.com/articles/d41...
Could AI-designed proteins be weaponized? Scientists lay out safety guidelines
AI tools that can come up with protein structures at the push of a button should be used safely and ethically, say researchers in the field.
www.nature.com
March 13, 2024 at 7:50 AM
Reposted by suwenbin.bsky.social
“ProSTAGE: Predicting Effects of Mutations on Protein Stability by Using Protein Embeddings and Graph Convolutional Networks”

Obtains “high predictive accuracy even when using AlphaFold2 predicted structures as input”

doi.org/10.1021/acs....
github.com/GenScript-IB...
January 23, 2024 at 6:10 AM
Reposted by suwenbin.bsky.social
“Tpgen: a language model for stable protein design with a specific topology structure” has been published

doi.org/10.1186/s12859-024-05637-5
January 24, 2024 at 6:41 AM
Reposted by suwenbin.bsky.social
“Exploring Transition States of Protein Conformational Changes via Out-of-Distribution Detection in the Hyperspherical Latent Space”

Shows high-energy states “exhibit a distributional shift from metastable states”

doi.org/10.26434/chemrxiv-2024-r8gjv
github.com/xuhuihuang/ts-dart
January 24, 2024 at 6:40 AM
Reposted by suwenbin.bsky.social
“TM-search: An Efficient and Effective Tool for Protein Structure Database Search” has been published

Applied TM-Align and iterative clustering to sift through huge DBs of protein structures faster than foldseek

doi.org/10.1021/acs....
January 26, 2024 at 6:10 AM
Reposted by suwenbin.bsky.social
“HybridDBRpred: improved sequence-based prediction of DNA-binding amino acids using annotations from structured complexes and disordered proteins” has been published

doi.org/10.1093/nar/...
January 26, 2024 at 6:10 AM
Reposted by suwenbin.bsky.social
“TransMEP: Transfer learning on large protein language models to predict mutation effects of proteins from a small known dataset”

www.biorxiv.org/content/10.1...
github.com/strodel-grou...
January 16, 2024 at 7:01 AM
Reposted by suwenbin.bsky.social
“TemStaPro: protein thermostability prediction using sequence representations from protein language models” has been updated

www.biorxiv.org/content/10.1...
github.com/ievapudz/Tem...
January 17, 2024 at 6:29 AM
Reposted by suwenbin.bsky.social
“ProteinMPNN Recovers Complex Sequence Properties of Transmembrane β-Barrels”

Inverse folding methods implicitly learn membrane protein energy functions

www.biorxiv.org/content/10.1...
github.com/marissadolor...
January 18, 2024 at 5:33 AM
Reposted by suwenbin.bsky.social
“xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein” has been updated

This version describes xT-Fold, which appends the ESMFold trunk to the 100B PLM with favorable results

www.biorxiv.org/content/10.1...
January 12, 2024 at 6:23 AM
Reposted by suwenbin.bsky.social
“OmeDDG: Improved Protein Mutation Stability Prediction Based on Predicted 3D Structures” has been published

Stability prediction using OmegaFold models

doi.org/10.1021/acs....
January 12, 2024 at 6:23 AM
Reposted by suwenbin.bsky.social
“Tree Search-Based Evolutionary Bandits for Protein Sequence Optimization”

A bandit-based sequence design approach for viral capsids

arxiv.org/abs/2401.06173
January 16, 2024 at 7:00 AM
Reposted by suwenbin.bsky.social
“PRO-LDM: Protein Sequence Generation with a Conditional Latent Diffusion Model” has been updated

www.biorxiv.org/content/10.1...
January 16, 2024 at 7:01 AM
Reposted by suwenbin.bsky.social
“Ancestral Sequence Reconstruction as a tool to detect and study de novo gene emergence”

Authors describe a workflow for de novo gene identification and find twenty genes that they believe emerged after the speciation of S. cerevisiae

www.biorxiv.org/content/10.1...
github.com/Nikos22/huma...
January 3, 2024 at 6:50 AM
Reposted by suwenbin.bsky.social
“ThermoFinder: A sequence-based thermophilic proteins prediction framework”

www.biorxiv.org/content/10.1...
github.com/Luo-SynBioLa...
January 3, 2024 at 6:50 AM
Reposted by suwenbin.bsky.social
“Protein language model powers accurate and fast sequence search for remote homology” has been updated

One of several recent methods to use embeddings rather than AA ID for sequence search/alignment problems

www.biorxiv.org/content/10.1...
dmiip.sjtu.edu.cn/PLMSearch
January 5, 2024 at 6:42 AM
Reposted by suwenbin.bsky.social
“State-of-the-RNArt: benchmarking current methods for RNA 3D structure prediction” has been updated

A review of how different methods have evolved over the last ~15 years

www.biorxiv.org/content/10.1...
January 6, 2024 at 7:51 AM
Reposted by suwenbin.bsky.social
“Homology detection using a protein secondary structure-based large language model”

www.biorxiv.org/content/10.1...
github.com/soroushv-dar...
December 21, 2023 at 4:36 AM
Reposted by suwenbin.bsky.social
“Position-specific evolution in transcription factor binding sites, and a fast likelihood calculation for the F81 model” has been updated

Uses “position-specific stationary vectors” like PSSMs for nucleotides instead of AAs

www.biorxiv.org/content/10.1...
github.com/rsidd120/TFB...
December 22, 2023 at 10:31 AM
Reposted by suwenbin.bsky.social
“Leveraging ancestral sequence reconstruction for protein representation learning”

A “small but focused” PLM for specific families ends up yielding more accurate sequences and smoother fitness landscapes than ESM-1b

www.biorxiv.org/content/10.1...
December 22, 2023 at 10:32 AM
Reposted by suwenbin.bsky.social
“Prot2Text: Multimodal Protein's Function Generation with GNNs and Transformers” has been updated

arxiv.org/abs/2307.14367
github.com/hadi-abdine/...
December 23, 2023 at 4:40 AM
Reposted by suwenbin.bsky.social
December 13, 2023 at 6:06 AM