Artur Szałata
chatgtp.bsky.social
Machine learning for molecular biology. ELLIS PhD student at Fabian Theis lab. EPFL alumnus.
Pinned
Interested in predicting transcriptomic effects of perturbations? Check out our @NeurIPS24 D&B spotlight living perturbation prediction benchmark & new drug perturbation dataset:
- paper: openreview.net/forum?id=WTI... !
- benchmarking platform: openproblems.bio/results/pert...
🧵1/8
A benchmark for prediction of transcriptomic responses to chemical...
Single-cell transcriptomics has revolutionized our understanding of cellular heterogeneity and drug perturbation effects. However, its high cost and the vast chemical space of potential drugs...
openreview.net
Reposted by Artur Szałata
And for a lot of Gen X journalists and academics, the answer to (a) — assuming existing skills and plans — is legit "no." AI can be useful, for sure, but the paths it differentially advantages are not the paths where they have accumulated momentum, expertise, and social capital. +
January 19, 2026 at 12:18 AM
A quick story on how we matched genes across two datasets with different Ensembl versions.
1. There must be a tool out there. Ensembl ID History converter ofc!
2. Doesn't match Ensembl search outcomes due to a bug
3. Lesson: use this client instead github.com/Ensembl/ense... !
Mapping of ENSG_IDs between different release of the Ensembl database · Issue #744 · Ensembl/ensembl
Dear members of the Ensembl team, I wasn’t sure who to contact, so I’m starting here. I am writing to ask you questions about the IDMapper tool presented on your website: https://www.ensembl.org/Ho...
github.com
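Before reaching for any converter, the quickest sanity check when two datasets disagree only on release versions is to strip the version suffix and intersect on the stable accessions. A minimal sketch in plain Python (hypothetical IDs; note this won't catch retired or merged accessions, which is exactly what the ID history tooling is for):

```python
import re

def strip_version(gene_id: str) -> str:
    # ENSG00000141510.17 -> ENSG00000141510; the stable accession is
    # kept across releases even when the version suffix is bumped
    return re.sub(r"\.\d+$", "", gene_id)

# Hypothetical gene lists from two datasets built on different releases
dataset_a = ["ENSG00000141510.17", "ENSG00000012048.23"]
dataset_b = ["ENSG00000141510.15", "ENSG00000268895.5"]

stable_a = {strip_version(g) for g in dataset_a}
stable_b = {strip_version(g) for g in dataset_b}

shared = sorted(stable_a & stable_b)
print(shared)  # ['ENSG00000141510']
```

Anything left unmatched after this step is where the real release differences live, and where the history mapping (via the client, not the buggy web tool) earns its keep.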
January 15, 2026 at 11:01 PM
Also awesome for finding posters at a conference!
I continue to be impressed with Scholar Inbox!

- Personal recommendations the day after the papers land on arXiv (significantly faster than Google Scholar?? 😱)

- Very good recommendations

- Email notifications for digests

- Super slick UI, both on mobile and desktop... check this out! 👇
January 13, 2026 at 1:02 PM
Reposted by Artur Szałata
Predicting cell state in previously unseen conditions has typically required retraining for each new biological context. Today, Arc is releasing Stack, a foundation model that learns to simulate cell state under novel conditions directly at inference time, no fine-tuning required.
January 9, 2026 at 6:44 PM
Reposted by Artur Szałata
Introducing DroPE: Extending Context by Dropping Positional Embeddings

We found embeddings like RoPE aid training but bottleneck long-sequence generalization. Our solution’s simple: treat them as a temporary training scaffold, not a permanent necessity.

arxiv.org/abs/2512.12167
pub.sakana.ai/DroPE
January 12, 2026 at 4:07 AM
Reposted by Artur Szałata
One very familiar pattern in AI and science right now is going from a lot of false starts on hard tasks (there have been near-misses where AI appears to solve an Erdős problem but just finds an old solution no one knew about) to actually doing the thing soon after.

Three Erdős problems in three days.
January 11, 2026 at 12:39 PM
Reposted by Artur Szałata
We introduce epiplexity, a new measure of information that provides a foundation for how to select, generate, or transform data for learning systems. We have been working on this for almost 2 years, and I cannot contain my excitement! arxiv.org/abs/2601.03220 1/7
January 7, 2026 at 5:28 PM
Reposted by Artur Szałata
December 30, 2025 at 1:40 PM
Reposted by Artur Szałata
Autonomous RIVR delivery robots in Pittsburgh
December 24, 2025 at 6:56 PM
Reposted by Artur Szałata
A quick Sunday rewrite of an old blog post about how one should evaluate the effectiveness of an empirical paper:
open.substack.com/pub/emergere...
How much should I trust this paper?
Or how to build your castle on sand
open.substack.com
December 14, 2025 at 5:34 PM
Reposted by Artur Szałata
Good researchers obsess over evals
The story of Olmo 3 (post-training), told through evals
NeurIPS Talk tomorrow.
Upper Level Room 2, 10:35AM.
Slides: docs.google.com/presentation...
December 6, 2025 at 8:35 PM
Reposted by Artur Szałata
Elon’s power is that he offers a positive vision of the future. This attracts employees, funding, support. There’s a massive techno-positive hole and he fills it.
November 17, 2025 at 2:02 PM
Active learning with DrugReflector beats SotA in phenotypic hit-rate for virtual screening. Includes a sc perturbation dataset with 10 lines and 104 compounds. Out in @science.org now!
Grateful to Cellarity and @fabiantheis.bsky.social for the opportunity to contribute to this outstanding project!
Active learning framework leveraging transcriptomics identifies modulators of disease phenotypes
Phenotypic drug screening remains constrained by the vastness of chemical space and technical challenges scaling experimental workflows. To overcome these barriers, computational methods have been dev...
www.science.org
October 23, 2025 at 7:41 PM
Reposted by Artur Szałata
What if we did a single run and declared victory
October 23, 2025 at 2:28 AM
Reposted by Artur Szałata
Community notes when
October 13, 2025 at 4:16 AM
Reposted by Artur Szałata
Yeah, this is my biggest “AGI hype is not real” tell: almost no one at these companies behaves like it’s real
October 11, 2025 at 8:58 PM
Reposted by Artur Szałata
My skepticism of LLM-as-scientist stems from how imbalanced the literature is. The median paper is a mildly negative result presented as positive, it's unclear how to RLHF on good vs. bad hypotheses, etc. We barely know how to teach this skill; how can we RLHF it?
September 28, 2025 at 8:40 PM
Reposted by Artur Szałata
For folks considering grad school in ML, my advice is to explore programs that mix ML with a domain interest. ML programs are wildly oversubscribed while a lot of the fun right now is in figuring out what you can do with it
September 25, 2025 at 3:25 AM
A must-read before you jump on your first omics project - the top response here www.reddit.com/r/bioinforma...
Here0s0Johnny's comment on "Exemplary papers on multi-OMICS integration with solid storytelling"
Explore this conversation and more from the bioinformatics community
www.reddit.com
August 28, 2025 at 6:06 PM
Reposted by Artur Szałata
I think scientists thought people could tell apart the serious science from the bad fluff and ideological work that we all mostly ignore. We were not ready for people to start conflating all of them together
August 23, 2025 at 6:20 PM
Reposted by Artur Szałata
The more rigorous peer review happens in conversations and reading groups after the paper is out with reputational costs for publishing bad work
August 17, 2025 at 4:12 PM
Reposted by Artur Szałata
There are people, in tech (and now in the government!), who will mislead you about what current AI models are capable of. If we don't call them out, they'll drag us all down.
Reporter: The FDA has a new AI tool that's intended to speed up drug approvals. But several FDA employees say the new AI helper is making up studies that do not exist. One FDA employee telling us, 'Anything that you don't have time to double check is unreliable. It hallucinates confidently'
July 23, 2025 at 8:01 PM
Reposted by Artur Szałata
Oops I read my parrot a math textbook and now it keeps squawking out the answer to unseen math competitions
July 22, 2025 at 1:09 PM
Excited to share that I started my summer at @genentech.bsky.social BRAID Perturbation team in SF with Alex Wu!

It's my first time on the West Coast - If you are around and would like to talk about ML and/or biology, hit me up!

Looking fwd to the AI x Bio Unconference tomorrow 🚀
June 18, 2025 at 4:37 AM