Pia Rautenstrauch
prauten.bsky.social
Computer Science PhD Student at @humboldtuni.bsky.social and @mdc-berlin.bsky.social | Data Science | Machine learning | AI | Bioinformatics | Genomics | Single-Cell Biology
Pinned
1/4 🧵 Preprint alert: In "Metrics Matter: Why We Need to Stop Using Silhouette in #SingleCell #Benchmarking," we reveal critical flaws in common #Evaluation metrics for #Integration and propose robust alternatives. @uweohler.bsky.social
www.biorxiv.org/content/10.1...
Metrics Matter: Why We Need to Stop Using Silhouette in Single-Cell Benchmarking
Current-day single-cell studies comprise complex data sets affected by nested batch effects caused by technical and biological factors, relying on advanced integration methods. Silhouette is an establ...
www.biorxiv.org
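For readers unfamiliar with the metric named in the pinned post, here is a minimal sketch of the standard silhouette coefficient, s = (b - a) / max(a, b), where a is a point's mean distance to its own cluster and b is its mean distance to the nearest other cluster. This is not code from the preprint, and it does not reproduce the preprint's argument; the function name `silhouette_scores` and the toy points are assumptions made for illustration.

```python
from math import dist


def silhouette_scores(points, labels):
    """Per-point silhouette coefficient; expects at least two distinct labels."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum)
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical toy data: two tight pairs of points, far apart.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
separated = silhouette_scores(toy_points, [0, 0, 1, 1])  # labels match the pairs
mixed = silhouette_scores(toy_points, [0, 1, 0, 1])      # labels cut across the pairs
print(separated)
print(mixed)
```

With labels that match the two pairs, every score is high; with labels that cut across the pairs, every score is negative.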
Reposted by Pia Rautenstrauch
🧠 The Lipid #Brain Atlas is out now! If you think #lipids are boring and membranes are all the same, prepare to be surprised. Led by @lucafusarbassini.bsky.social with Giovanni D'Angelo's lab, we mapped membrane lipids in the mouse brain at high resolution.
www.biorxiv.org/cgi/content/...
October 16, 2025 at 6:23 AM
Reposted by Pia Rautenstrauch
We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)
September 22, 2025 at 5:29 AM
Reposted by Pia Rautenstrauch
I did not know Taylor Swift was moonlighting in soliciting contributions for fake journals!
September 16, 2025 at 7:29 PM
Check out my talented colleagues' study, profiling hundreds of CRISPRa-responsive regulatory elements surrounding PHOX2B, a key player in neuroblastoma, using a targeted scRNA-seq screen in a neuroblastoma cell line.
I am so happy to share that our paper is officially published in Cell Genomics! In this paper, we describe TESLA-seq, which combines pooled CRISPR activation with targeted single-cell RNA-seq to map enhancer-gene connections at high sensitivity.

Link to the full story: www.cell.com/cell-genomic...
September 13, 2025 at 4:21 PM
Reposted by Pia Rautenstrauch
Our first Fall #tidyomics meeting will be this Wed 10 September, early in US / noon in Europe / late in Australia. Feel free to join if you're interested in what we are doing to make omics data more amenable to tidy data analysis.

Organized with Stefano @stemang.bsky.social
September 8, 2025 at 7:45 PM
Reposted by Pia Rautenstrauch
The Matilda effect is not fiction.
It is written into the history of science.
It eclipsed women such as Marthe Gautier, born a hundred years ago, a forgotten pioneer of trisomy 21 research.
➡️ https://l.franceculture.fr/1LI
September 10, 2025 at 4:00 AM
Reposted by Pia Rautenstrauch
Are electronic health records (EHR) more predictive of disease onset than polygenic scores? Can we transfer EHR-based prediction models between countries? Our study on these questions, using 3 biobank-based studies with N>845K, is out in @natgenet.nature.com today:

www.nature.com/articles/s41...
Cross-biobank generalizability and accuracy of electronic health record-based predictors compared to polygenic scores - Nature Genetics
Comparison of electronic health record-based phenotype risk scores (PheRS) and polygenic scores (PGS) across 13 common diseases and three biobank-based studies indicates that PheRS and PGS may provide...
www.nature.com
August 27, 2025 at 2:15 PM
Reposted by Pia Rautenstrauch
Last year I met a bunch of great researchers who work with high-dimensional data at a Dagstuhl seminar. This week we put out a preprint about the history and philosophy of low-dimensional embedding methods, their applications, their challenges, and their possible future: arxiv.org/abs/2508.15929
August 27, 2025 at 1:25 PM
Reposted by Pia Rautenstrauch
We spent a year writing this review of low-dim embeddings and arguing about things like epistemic roles and best practices :-) 20+ authors are all participants of the Dagstuhl seminar we held last year: www.dagstuhl.de/24122. Led by @alexandr.bsky.social and Cyril de Bodt.

arxiv.org/abs/2508.15929
August 27, 2025 at 3:14 PM
Reposted by Pia Rautenstrauch
We're committed to helping as many attendees as possible join us at #scverse2025 - feel free to reach out if you have questions!
💰 Travel Grants Available for scverse conference 2025! 💰
Did you know we are offering grants to help anyone in financial need attend our annual conference? 🌍
🧵

#scverse #scverse2025 #SingleCell #Conference #StanfordUniversity #TravelGrant
August 25, 2025 at 5:17 PM
Reposted by Pia Rautenstrauch
Antibodies are highly diverse, but most possible sequences are unstable or polyreactive. In this work, just published in Cell Syst., we propose a new source of data for modeling constraints from these properties. Our models show clear improvements in predicting Ab dysfunction. (1/n)
t.co/qCZERPUMPF
https://authors.elsevier.com/a/1lbX08YyDfuZWX
t.co
August 15, 2025 at 1:17 PM
Reposted by Pia Rautenstrauch
Excited to share our latest paper @natmethods.nature.com
We present a high-throughput framework to map cellular interactions at ultra-high scale – broadly applicable from whole-organism immune response mapping to personalized therapy response prediction (1/4).
www.nature.com/articles/s41...
August 7, 2025 at 11:24 AM
Reposted by Pia Rautenstrauch
This preprint from Helen Sakharova is one of the coolest things to come out of my lab: “Protein language models reveal evolutionary constraints on synonymous codon choice.” Codon choice is a big puzzle in how information is encoded in genomes, and we have a new angle. www.biorxiv.org/content/10.1...
Protein language models reveal evolutionary constraints on synonymous codon choice
Evolution has shaped the genetic code, with subtle pressures leading to preferences for some synonymous codons over others. Codons are translated at different speeds by the ribosome, imposing constrai...
www.biorxiv.org
August 7, 2025 at 8:29 AM
Reposted by Pia Rautenstrauch
Evaluating something like batch correction requires looking at the data, and picking metrics that capture what you care about. Great work @prauten.bsky.social and @uweohler.bsky.social
August 5, 2025 at 12:48 PM
Truly grateful for the exceptional opportunity to participate in #LPSHG2025 last week, featuring a stellar ✨ lineup of leading researchers who doubled as tutors, alongside inspiring fellow PhD students. Excited to apply my learnings and see where this collaborative spirit takes genomics next!
August 1, 2025 at 11:49 AM
Reposted by Pia Rautenstrauch
*Easter egg alert* NOT in the published paper. We also benchmarked Evo 2, and while it did better than other gLMs (consistent with the idea that scale can improve gLMs), it still falls short of a basic CNN trained on one-hot sequences and far short of supervised SOTA
July 16, 2025 at 12:16 PM
Reposted by Pia Rautenstrauch
The duplication crisis: the other replication crisis - www.worksinprogress.news/p/the-duplic...
The duplication crisis: the other replication crisis
How bad publishing incentives hinder long-term thinking in computational biology research
www.worksinprogress.news
June 2, 2025 at 7:41 PM
Reposted by Pia Rautenstrauch
The deadline for the VIB.AI group leader positions is approaching - send in your CV and short research plan before 14th June to start your BioML research lab in Leuven or Ghent
We want to connect:
To link model builders with data generators.
To bring together scientists asking why cells behave the way they do, and others figuring out how to model that behavior.

If you're working on AI in biology, consider joining!
https://tinyurl.com/y35m6khy
June 4, 2025 at 7:16 AM
Reposted by Pia Rautenstrauch
Excited to share my first contribution here at Illumina! We developed PromoterAI, a deep neural network that accurately identifies non-coding promoter variants that disrupt gene expression.🧵 (1/)
May 29, 2025 at 11:57 PM
Reposted by Pia Rautenstrauch
We finally concluded the meeting. Thanks to all attendees for their scientific contributions and for traveling (near or far) to the meeting! Thanks to the local organizers for the infrastructure and catering, and thanks to the co-organizers @yaronorenstein.bsky.social @camillemrcht.bsky.social!
April 25, 2025 at 8:18 AM
Reposted by Pia Rautenstrauch
When investors learn that the trait for green eyes is also ~20 SNPs
April 14, 2025 at 12:45 AM
Reposted by Pia Rautenstrauch
1. LLM-generated code tries to run code from online software packages. Which is normal but
2. The packages don’t exist. Which would normally cause an error but
3. Nefarious people have made malware under the package names that LLMs make up most often. So
4. Now the LLM code points to malware.
LLMs hallucinating nonexistent software packages with plausible names leads to a new malware vulnerability: "slopsquatting."
LLMs can't stop making up software dependencies and sabotaging everything: Hallucinated package names fuel 'slopsquatting'
www.theregister.com
April 12, 2025 at 11:43 PM
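One possible guard against the chain described in the post above, sketched under stated assumptions: before installing anything that LLM-generated code asks for, list the modules it imports and flag any that are not on a vetted allowlist. The helper names, the allowlist, and the package name `totally_made_up_pkg` are hypothetical. Import names do not always match distribution names on a package index, so this is only a first filter and not a complete defense.

```python
import ast


def imported_top_level_modules(source):
    """Collect top-level module names from the import statements in `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports, which refer to local code
            names.add(node.module.split(".")[0])
    return names


def unvetted_modules(source, allowlist):
    """Return imported modules that nobody has reviewed yet."""
    return imported_top_level_modules(source) - set(allowlist)


# Hypothetical LLM output; `totally_made_up_pkg` stands in for a hallucinated name.
generated_code = """
import json
import numpy as np
from totally_made_up_pkg.turbo import loads
"""

ALLOWLIST = {"json", "numpy"}  # assumption: maintained and reviewed by a human
flagged = unvetted_modules(generated_code, ALLOWLIST)
print(sorted(flagged))
```

Anything flagged would be checked by hand before installation; nothing is installed automatically.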
Reposted by Pia Rautenstrauch
40 days until #RECOMB2025 in Seoul! 523 attendees confirmed—thank you! 🥳

For those who haven't registered yet, we have great news!
📢 Early bird deadline is extended to Friday, March 21st 📢
Register now at recomb2025.com 🎟️
March 15, 2025 at 2:08 PM
Reposted by Pia Rautenstrauch
Last day to apply ⏰😱!

If you haven’t already done so, now is the time! Apply to be one of the speakers at the #SoapboxScience summer event and present your research in a fun and relaxed atmosphere ✨

#WomenInSTEM #WomenInScience #Scicomm
February 21, 2025 at 10:25 AM