Andrew White 🐦‍⬛
andrew.diffuse.one
@andrew.diffuse.one
Head of Sci/cofounder at futurehouse.org. Prof of chem eng at UofR (on sabbatical). Automating science with AI and robots in biology. Corvid enthusiast
Making "AI Scientists" has become a hot topic lately. The first reference I could find was from 2008. The term has been in use for nearly 20 years! For example, "Adam," an AI Scientist robot for studying yeast, was published in 2009. I wrote a short post about the term and what it means now.

diffuse.one/p/w1-001
October 29, 2025 at 6:26 PM
I finished my estimate on required compute to make an atomic-resolution virtual cell: 10^38 FLOPs to simulate a human cell for 1 day. We should be able to do this simulation in 2074 using 200 TW of power. 1/3
September 26, 2025 at 3:19 PM
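The virtual-cell estimate above gives a total (10^38 FLOPs) and a power budget (200 TW), but this post alone does not state how long the run would take. As a rough sketch only, the arithmetic below shows what hardware efficiency those two numbers would imply for a given wall-clock time. The wall-clock durations and both function names are assumptions made for illustration, not figures from the estimate.

```python
# Back-of-the-envelope helpers for the quoted numbers (10^38 FLOPs, 200 TW).
# The wall-clock time is NOT given in the post; it is a free parameter here.

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

TOTAL_FLOPS = 1e38      # from the post: one simulated day of a human cell
POWER_WATTS = 200e12    # from the post: 200 TW


def required_flops_per_second(total_flops: float, wall_clock_seconds: float) -> float:
    """Sustained throughput needed to finish within the given wall-clock time."""
    return total_flops / wall_clock_seconds


def required_flops_per_joule(
    total_flops: float, power_watts: float, wall_clock_seconds: float
) -> float:
    """Energy efficiency needed if the whole power budget runs for that long."""
    energy_joules = power_watts * wall_clock_seconds
    return total_flops / energy_joules


# Hypothetical scenario: the run is allowed one year of wall-clock time.
rate = required_flops_per_second(TOTAL_FLOPS, SECONDS_PER_YEAR)
efficiency = required_flops_per_joule(TOTAL_FLOPS, POWER_WATTS, SECONDS_PER_YEAR)
print(f"{rate:.3e} FLOP/s, {efficiency:.3e} FLOP/J")
```

Under the assumed one-year run, the power budget works out to roughly 1.6e16 FLOPs per joule. Shorter runs raise that requirement proportionally.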
Our ether0 paper was accepted at NeurIPS 2025! Very proud of the FutureHouse team!
September 19, 2025 at 3:42 PM
Google Scholar has a full-text index of nearly all research papers. You can use it to get counts for arbitrary phrases. I've been using this to measure popularity of things in science. For example, here's the popularity of Greek letters used in equations 1/3
September 14, 2025 at 4:52 PM
I've written up some thoughts on publishing for machines. 10M research papers are published per year and there are 227M total - machines will be the primary producers and readers of publications going forward. Humans simply cannot keep up. It's time to think about revising the scientific paper.
August 15, 2025 at 6:10 PM
HLE has recently become the benchmark to beat for frontier agents. We at FutureHouse took a closer look at the chem and bio questions and found about 30% of them are likely invalid based on our analysis and third-party PhD evaluations. 1/7
July 23, 2025 at 4:29 PM
Reposted by Andrew White 🐦‍⬛
1/4
🚀 Announcing the 2025 Protein Engineering Tournament.

This year’s challenge: design PETase enzymes, which degrade the type of plastic in bottles. Can AI-guided protein design help solve the climate crisis? Let’s find out! ⬇️

#AIforBiology #ClimateTech #ProteinEngineering #OpenScience
July 8, 2025 at 4:26 PM
I have written up a 3.5k word/10 figure essay on how to write a reward function while avoiding reward hacking for chemistry. It covers all the ridiculous ways we had to avoid reward hacking for training ether0, our scientific reasoning model.

diffuse.one/p/m1-000
June 22, 2025 at 3:21 PM
FutureHouse's goal has been to automate scientific discovery. Now we used our agents to make a genuine discovery – a potential new treatment for one kind of blindness (dAMD). We had multiple cycles of hypotheses, experiments, and data analysis – including identifying the mechanism.
May 20, 2025 at 3:35 PM
We shipped multi-agents today! Our chemistry design agent can now call Crow, our scholarly research agent, to bring in data from literature/clinical trials/Open Targets while designing molecules.

platform.futurehouse.org
May 13, 2025 at 3:45 PM
Integrating @opentargets.org is so helpful to provide evidence for disease mechanisms independent of the literature. Here's a demo of synthesizing 78 papers and Open Targets to propose two novel targets for triple-negative breast cancer

See the answer: platform.futurehouse.org/trajectories...
May 11, 2025 at 1:59 AM
We have an API for clinical trials on our platform - which means you can ask questions like "what trials will read out in June for NSCLC and how likely would you rate their success based on previous trials in the area." Pretty cool.

Answer: platform.futurehouse.org/trajectories...
May 9, 2025 at 4:54 PM
Here's a command that converts a DOI to bibtex:
May 6, 2025 at 10:45 PM
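The command from the DOI-to-BibTeX post above is not reproduced here. As a stand-in, here is a minimal Python sketch of one common way to do the conversion: DOI content negotiation, where a request to the doi.org resolver with an `Accept: application/x-bibtex` header returns a BibTeX entry. This is not necessarily the approach the post used. The function names and the placeholder DOI are illustrative assumptions.

```python
import urllib.request
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def build_bibtex_request(doi: str) -> urllib.request.Request:
    """Build a content-negotiation request asking the resolver for BibTeX."""
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    url = DOI_RESOLVER + quote(doi.strip(), safe="/")
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})


def doi_to_bibtex(doi: str, timeout: float = 10.0) -> str:
    """Fetch the BibTeX entry for a DOI (performs a network request)."""
    with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (needs network access; "10.1000/xyz123" is a placeholder, not a real paper):
#   print(doi_to_bibtex("10.1000/xyz123"))
```

The request construction is kept separate from the fetch so the URL and header can be inspected without touching the network.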
Reposted by Andrew White 🐦‍⬛
I always look forward to FutureHouse releases. I had to do a little digging for API information so here it is for those who are interested.
futurehouse.gitbook.io/futurehouse-...
May 2, 2025 at 10:52 PM
Reposted by Andrew White 🐦‍⬛
We have gotten some really good responses to science questions from platform.futurehouse.org already. Both from "Crow" (short answers) and "Falcon" (deep research).

It looks like this is state of the art right now!
May 2, 2025 at 10:03 PM
Really happy to have this available on an API and free, today!
Today, FutureHouse is launching the FutureHouse Platform, bringing the first-ever superintelligent scientific AI agents to scientists everywhere via a web interface and API. The Platform is launching with four agents, each with their own specialization:
May 1, 2025 at 4:16 PM
The plan at FutureHouse has been to build scientific agents for discoveries. We’ve spent the last year researching the best way to make agents. We’ve made a ton of progress and now we’ve engineered them to be used at scale, by anyone. Free and on API.
May 1, 2025 at 4:16 PM
Sam Cox and I are giving the MIA seminar at the Broad Institute in Boston tomorrow. Going to tease some new results on something unrelated to scientific agents and squarely in the domain of chemistry.
March 18, 2025 at 9:13 PM
It's ridiculous, but there hasn't existed a one-liner to quickly get functional groups of a molecule. Little Friday night coding exercise to get this working.

Enjoy - and let me know of any missing functional groups! I could only do a few hundred.
March 8, 2025 at 7:58 AM
Half of an AI scientist is rejecting or accepting hypotheses. FutureHouse and Science Machines just put out ~300 novel hypotheses from ~50 published papers along with ground-truth data. Humans take 4.2 hours to solve these and frontier models get 10-20% correct.

This is like SWE-bench for comp bio
March 4, 2025 at 4:41 PM
We should start using SI notation for token counts - like 1 megatoken context window or 64 kilotoken reasoning model.
Then we can write: 64kt or 1Mt etc.

Or you can say - "my prompt is 1.6 kilotokens" - which sounds badass
February 25, 2025 at 9:49 PM
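The SI-style token notation suggested in the post above is easy to mechanize. Here is a small sketch of a formatter, assuming decimal prefixes (1 kt = 1,000 tokens, not 1,024) and the standard SI symbols k, M, and G. The function name `format_tokens` is made up for illustration.

```python
def format_tokens(n: int) -> str:
    """Format a token count with SI prefixes, e.g. 64000 -> '64 kt'."""
    # Assumption: decimal SI prefixes, largest applicable prefix wins.
    prefixes = [(10**9, "G"), (10**6, "M"), (10**3, "k")]
    for factor, symbol in prefixes:
        if n >= factor:
            # ':g' drops trailing zeros, so 64.0 prints as '64'.
            return f"{n / factor:g} {symbol}t"
    return f"{n} t"


print(format_tokens(1_000_000))  # a 1 megatoken context window
print(format_tokens(64_000))     # a 64 kilotoken reasoning budget
print(format_tokens(1_600))      # "my prompt is 1.6 kilotokens"
```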
PaperQA2 can now work with clinical trials. It considers both research papers and clinical trials jointly to answer complex questions. It uses the ClinicalTrials.gov API - so it can do complex queries too. Check out the tutorial below:

futurehouse.gitbook.io/futurehouse-...
February 25, 2025 at 4:27 PM
It's been about a month since the first batch of reasoning models was released. There have been about a dozen reproductions since then and some patterns are emerging. I’ve written up my own notes on training recipes, frameworks, rumors, and major open questions.

diffuse.one/p/d2-000
February 22, 2025 at 8:09 PM
Image duplication has been a powerful signal for detecting scientific fraud, but is irrelevant in many fields. I've been working a bit on finding new signals like it that work across fields. I've found one using LLMs that can predict retractions, weakly, for $1 per paper. 1/4
February 19, 2025 at 1:26 PM
Molecular dynamics requires a lot of expert knowledge to set up and analyze simulations. We set out to automate it with LLM agents: MDCrow!
February 14, 2025 at 11:05 PM