Joel Shor
joelshor.bsky.social
AI for drug discovery, AI for social good, retired cultural truffle pig
September 11, 2025 at 5:59 PM
Please check out our paper, or try our benchmark / designers. If you have any questions, contact info@move37labs.ai. Happy designing!
June 26, 2025 at 6:22 PM
Using insights from NucleoBench, we publish a new optimizer, AdaBeam (Adaptive Beam Search), that outperforms existing algorithms on 11 of 16 tasks and demonstrates superior scaling properties. Both NucleoBench and AdaBeam are available on GitHub.
June 26, 2025 at 6:22 PM
As design sequences get longer and task models get larger, which designers lose performance?
June 26, 2025 at 6:22 PM
Which designers performed best, and on which tasks? What role do start sequences and randomness play in performance? Which designers converge more quickly? Do some tasks have start sequences that are inherently “harder,” and how often do those occur?
June 26, 2025 at 6:22 PM
NucleoBench emphasizes the design of large sequences and the use of large models. Running over 400K experiments in apples-to-apples comparisons, we are able, for the first time, to address a number of important questions.
June 26, 2025 at 6:21 PM
you can start to fix the model bias without waiting for the expensive, labor-intensive labeling process to complete. For certain kinds of medical data, that process can cost hundreds of millions of dollars and take decades.
November 27, 2024 at 7:53 PM
have a bunch of doctors label it, then evaluate the model on it and learn about the decrease in performance. If, instead, you can use the model that Carson and I just made, and just use skin photos to predict that the model won't do very well,
November 27, 2024 at 7:53 PM
Say some international skin rash classification model, which normally finds cancer at 98% accuracy, will only get to 70% accuracy on Thai women over the age of 65. Normally, to figure that out, you'd have to collect a bunch of skin images from that group, get their medical records or
November 27, 2024 at 7:53 PM
Question from a friend: "Fascinating! Can you explain your findings to someone with non-expert familiarity with AI models and say more about the implications? For example I don't understand why that helps identify bad model performance."

Answer:
November 27, 2024 at 6:00 PM
Why this matters:
- generalizing medical AI models to new patient populations
- indicating when performance might be at risk
- prioritizing which data should be collected and labeled to include in training.
November 20, 2024 at 4:12 PM