Bernardo P. de Almeida
bernardo-almeida.bsky.social
Senior research scientist at InstaDeep, interested in building computational models that can read the human genome and interpret its variation
Reposted by Bernardo P. de Almeida
Wild: Viable mice from 2 sperm. Very involved - among other things, it requires making a "20KO" first - 20 different loci deleted/edited to avoid imprinting effects. And low survival rate, but proof of concept nonetheless.

www.cell.com/cell-stem-ce...
Adult bi-paternal offspring generated through direct modification of imprinted genes in mammals
Li and colleagues successfully generated bi-paternal mice that developed to adulthood by targeting 20 key imprinted loci. This study underscores imprinting gene abnormalities as the primary barrier to...
www.cell.com
January 29, 2025 at 2:26 PM
Reposted by Bernardo P. de Almeida
New (and hotly anticipated - at least by me) preprint from my group describing a better way to partition training data for genomic-trained models to solve the long-neglected problem of homology-based data leakage. Thread from first author @muntakimrafi.bsky.social 👇
0/ Essential reading for anyone training or using sequence-function models trained on genomic sequences! 🚨 In our new preprint, we explore the ways homology within genomes can cause leakage when training sequence-based models and ways to prevent it
January 27, 2025 at 11:48 PM
Reposted by Bernardo P. de Almeida
0/ Essential reading for anyone training or using sequence-function models trained on genomic sequences! 🚨 In our new preprint, we explore the ways homology within genomes can cause leakage when training sequence-based models and ways to prevent it
January 27, 2025 at 11:04 PM
Reposted by Bernardo P. de Almeida
Excited to present the results of my 20% project in collaboration with @broadinstitute.org and @danafarber.bsky.social. In our new paper we demonstrate a long-range model capable of detecting regulatory elements at distances beyond a million base pairs.
A multi-modal transformer for cell type-agnostic regulatory predictions
Javed and Weingarten et al. created a multi-modal transformer that learns generalizable representations of genomic sequence and chromatin accessibility by utilizing a novel masked-accessibility pre-tr...
www.cell.com
January 30, 2025 at 4:51 PM
Reposted by Bernardo P. de Almeida
I wanted to write briefly about a very pleasant experience we recently had coordinating and collaborating closely on competing publications with 2 other teams. 1/
January 24, 2025 at 7:36 PM