Dhuvi Karthikeyan
@dkarthikey1.bsky.social
Shape rotator and vibes curator at UNC’s Personalized Immunotherapy Research Lab. Generative modeling and representation learning in biology. 👨‍🏫👨‍🔬
Now out in @natmachintell.nature.com

TCRT5 is a rapid generator of target-conditioned CDR3b sequences, leads SoTA, and yields the first AI-designed, self-tolerant binder to an OOD non-viral epitope (with experimental validation)

📑: www.nature.com/articles/s42...
🤗: huggingface.co/dkarthikeyan1
👨‍💻: github.com/pirl-unc/tcr_translate
September 9, 2025 at 9:25 AM
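For anyone who wants to poke at the generation interface: a minimal sketch of target-conditioned CDR3b sampling with a T5-style seq2seq model. Everything here is illustrative — the tiny character-level T5 below is randomly initialised; real usage would load the released TCRT5 checkpoint and tokenizer from the Hugging Face repo above instead of building these by hand.

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration

# Toy stand-in: a tiny, randomly initialised character-level T5 over the
# 20 amino acids. Real usage would load the released TCRT5 checkpoint
# and its tokenizer rather than constructing these from scratch.
AAS = "ACDEFGHIKLMNPQRSTVWY"
vocab = {aa: i + 3 for i, aa in enumerate(AAS)}  # 0=pad, 1=eos, 2=unk

config = T5Config(vocab_size=23, d_model=64, d_ff=128, num_layers=2,
                  num_heads=4, d_kv=16, decoder_start_token_id=0)
model = T5ForConditionalGeneration(config).eval()

def encode(seq: str) -> torch.Tensor:
    return torch.tensor([[vocab[c] for c in seq] + [1]])  # append </s>

# Condition on an epitope (here the CMV pp65 peptide NLVPMVATV) and
# sample several candidate CDR3b sequences from the decoder.
epitope_ids = encode("NLVPMVATV")
out = model.generate(epitope_ids, do_sample=True, top_k=10,
                     max_new_tokens=18, num_return_sequences=4)
inv = {i: aa for aa, i in vocab.items()}
cdr3s = ["".join(inv.get(int(t), "") for t in row) for row in out]
print(cdr3s)  # four samples (untrained here, so effectively random)
```

Sampling with `num_return_sequences > 1` is what turns a single conditional distribution into an epitope-specific repertoire, per the framing in the thread below.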
On the merits of reading papers, attending conferences, preprinting manuscripts… Overheard at the Library of Congress.
December 30, 2024 at 2:17 AM
🧙🪄Summoning Alex to 💙🦋 so the Bio x ML community can celebrate ESM-C with him on here as well
December 4, 2024 at 8:26 PM
CA Coworking 🌉🌅
November 26, 2024 at 11:33 PM
TCRT5 generates better-than-random sequences even against the notoriously challenging IMMREP2023 "private" antigen set, and recovers 1/12 known sequences for one of them. (PS - Thank you @Giulio (not on here, yet) for soNNia, a brilliant model of V(D)J recombination and thymic selection)
November 19, 2024 at 6:54 PM
This key finding helped us choose our flagship TCRT5-FT model which we evaluate with a more qualitative lens. We find that TCRT5:

>preferentially samples sequences with high biological pGen
>generates antigen-specific sequences to unseen epitopes
>generates sequences not seen during training
November 19, 2024 at 6:54 PM
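The pGen and novelty checks above amount to a filter over sampled candidates. A minimal sketch — note that `toy_pgen` is a made-up stand-in; in the paper, pGen comes from a generative model of V(D)J recombination (OLGA/soNNia), not from anything like this heuristic.

```python
import math

def toy_pgen(cdr3: str) -> float:
    """Hypothetical stand-in for P_gen from OLGA/soNNia: shorter,
    motif-conforming CDR3b sequences score higher (purely illustrative)."""
    score = math.exp(-0.5 * len(cdr3))
    if cdr3.startswith("CAS") and cdr3.endswith("F"):
        score *= 10.0
    return score

def rank_by_pgen(candidates, train_set):
    """Keep novel sequences (unseen in training), sorted by pGen."""
    novel = [s for s in candidates if s not in train_set]
    return sorted(novel, key=toy_pgen, reverse=True)

train = {"CASSLGQAYEQYF"}
samples = ["CASSLGQAYEQYF", "CASSIRSSYEQYF", "CATSRDTQYF"]
print(rank_by_pgen(samples, train))
# -> ['CASSIRSSYEQYF', 'CATSRDTQYF']  (exact training match is dropped)
```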
POLYSPECIFIC TCRS! It turns out the bidirectional models sample more empirically de-risked CDR3b sequences, i.e. sequences that appear frequently in the training data and have been shown to bind highly dissimilar targets. Think super-sequences with high empirical utility.
November 19, 2024 at 6:54 PM
🗣️ @Protein Designers, something for y'all: We tried bidirectional and multi-task translation (target-conditioned receptor generation and receptor-conditioned target generation) to see if self-consistency improves the sequence quality for TCR generation, and we found something rather interesting...
November 19, 2024 at 6:54 PM
TCR-TRANSLATE frames this as a seq2seq task (w tricks from low-resource machine translation to handle sparsity). Given a target of interest, can we generate a faithful conditional distribution over cognate sequences and sample from it to get an epitope-specific TCR repertoire?
November 19, 2024 at 6:54 PM
My first Skeetorial!

💻🧬TCR-TRANSLATE - A new framework for thinking about the TCR:pMHC specificity problem.

TLDR:
We pretrained LLMs on ~8M TCR & pMHC seqs
Finetuned on sparse pMHC->TCR pair data
Validated CDR3b sequences to unseen antigens
>> random performance on IMMREP2023 "private" antigens
November 19, 2024 at 6:54 PM
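The two-stage recipe in the TLDR (pretrain on unpaired sequences, then finetune on sparse pMHC->TCR pairs) can be sketched with a T5-style span-corruption objective. The span-corruption choice is an assumption for illustration; the paper's exact pretraining objective and sentinel scheme may differ.

```python
def span_corrupt(seq: str, start: int, span_len: int = 3):
    """T5-style span corruption: the encoder sees the sequence with a
    span replaced by a sentinel; the decoder reconstructs the span."""
    corrupted = seq[:start] + "<X>" + seq[start + span_len:]
    target = "<X>" + seq[start:start + span_len]
    return corrupted, target

# Stage 1: self-supervised pretraining examples from unpaired TCR/pMHC
# sequences (the ~8M set mentioned above).
print(span_corrupt("CASSLAPGATNEKLFF", start=4))
# -> ('CASS<X>GATNEKLFF', '<X>LAP')

# Stage 2: supervised finetuning pairs map epitope -> CDR3b directly.
finetune_pair = ("NLVPMVATV", "CASSLAPGATNEKLFF")
```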