Sam Blau
samblau.bsky.social
Research scientist & computational chemist at Berkeley Lab using HT DFT workflows, machine learning, and reaction networks to model complex reactivity.
I'm hiring postdocs @berkeleylab.lbl.gov to drive cutting-edge research involving MLIPs, high-throughput workflows, chemical reaction networks, generative models, and open-source software dev. Full position description + application here: forms.gle/zePBZDmciXez... #Chempostdoc #AI4Science
November 4, 2025 at 9:16 PM
Reposted by Sam Blau
Interested in learning more about our recently published OMol25 dataset and the advances that it's bringing to atomistic machine learning? Check out this talk that my boy @samblau.bsky.social gave as part of the "Modeling Talk Series".

#CompChem ⚗️ 🧪 #SciML
Modeling Talk Series - The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models
Samuel Blau, Berkeley Lab. Video recording and slides (pptx, pdf).
sites.google.com
July 31, 2025 at 10:44 AM
I'm presenting OMol25 tomorrow 7/29 at 9 AM PST as part of a talk series at Google. Learn how we built the dataset + how MLIPs trained on OMol are revolutionizing comp chem!
Meet: lnkd.in/g4AAWkcK
YouTube Stream: lnkd.in/ggmtMtTR
Join group: lnkd.in/g5ciuNuX
July 29, 2025 at 2:42 AM
OMol25 was calculated with ORCA. I want to acknowledge the work of the ORCA team to improve the quality of the gradient + the robustness of SCF convergence for complicated systems as part of the OMol effort - it was much appreciated and critical to ensuring that we're releasing high quality data!
May 15, 2025 at 7:27 PM
Reposted by Sam Blau
🚨 Just dropped: Open Molecules 2025 — a record-breaking dataset co-led by Berkeley Lab + Meta FAIR.

100M+ DFT snapshots. Built to train #AI for real-world chemistry 🧪.

Could reshape discovery in batteries, drug discovery & much more! @cs.lbl.gov ⬇️
Computational Chemistry Unlocked: A Record-Breaking Dataset to Train AI Models has Launched - Berkeley Lab
Scientists will finally be able to simulate the chemistry that drives our bodies, our environment, and our technologies.
newscenter.lbl.gov
May 14, 2025 at 4:47 PM
The Open Molecules 2025 dataset is out! With >100M gold-standard ωB97M-V/def2-TZVPD calcs of biomolecules, electrolytes, metal complexes, and small molecules, OMol is by far the largest, most diverse, and highest quality molecular DFT dataset for training MLIPs ever made 1/N
May 14, 2025 at 8:52 PM
It was a pleasure to give an IIDAI seminar on nanoparticle ML for gradient-based heterostructure optimization (w/ @emorychannano.bsky.social) and neural network path opt for finding reaction transition states on MLIPs (w/ @thglab.bsky.social) - find the talk here: www.youtube.com/watch?v=-4jB...
IIDAI Seminar, 5/1/2025, Samuel M. Blau (Berkeley Lab)
YouTube video by Coordinated Science Laboratory
May 1, 2025 at 9:56 PM
Reposted by Sam Blau
🧠 New postdoctoral researcher position at Princeton for those interested in data science and machine learning! Specify my group if you are interested in working together. Deadline is May 31. Details: puwebp.princeton.edu/AcadHire/app...
May 1, 2025 at 9:52 PM
Final day to submit abstracts for ACS Fall 2025! Reminder that @ewcspottesmith.bsky.social, Brett Savoie (Notre Dame), and I are organizing a symposium on "Chemical Reaction Networks, Retrosynthesis, and Reaction Prediction". Will be a mix of invited and contributed talks - please submit! #CompChem
March 31, 2025 at 4:29 PM
Reposted by Sam Blau
the @gpggrp.bsky.social is at the ACS Spring 2025! come check out the works of Daniil Boiko and Rob MacKnight at the "ML + AI in Organic Chemistry" Symposium (Hall B-1, Room 4) today! extreme scaling of experimental chemical reactions via MS and an OS for autonomous comp chem!
March 24, 2025 at 4:56 PM
Looking forward to speaking at ACS on Sunday at 5:30! Come learn about "Popcornn" - a new method for double-ended transition state optimization atop machine learned interatomic potentials that is substantially better than NEB or GSM.
March 21, 2025 at 11:32 PM
Fantastic new work from Aditi & co that shows how to leverage the expressivity + accuracy of massive pre-trained MLIPs to distill smaller, much faster models that are still extremely accurate to drive downstream simulations - no need to compromise on speed vs accuracy!
1/ Machine learning force fields are hot right now 🔥: models are getting bigger + being trained on more data. But how do we balance size, speed, and specificity? We introduce a method for doing model distillation on large-scale MLFFs into fast, specialized MLFFs! More details below:

#ICLR2025
March 13, 2025 at 3:11 PM
Applications closing in one week! If you’re interested in a prestigious postdoc at the intersection of AI/ML and nuclear nonproliferation, don’t hesitate to apply - come work with me on fascinating f-block chemistry and computational/ML methods! (Must be a US citizen)
January 24, 2025 at 9:09 PM
Reposted by Sam Blau
January 10, 2025 at 3:14 AM
Reposted by Sam Blau
@samblau.bsky.social, Brett Savoie (Notre Dame), and I are organizing a symposium for @amerchemsociety.bsky.social Fall 2025 called "Chemical Reaction Networks, Retrosynthesis, and Reaction Prediction" under @acscomp.bsky.social.

#reactionnetwork #CRN #retrosynthesis 🧪 ⚗️ #CompChem
January 8, 2025 at 2:22 PM
Reposted by Sam Blau
Inverse Design of Complex Nanoparticle Heterostructures via Deep Learning on Heterogeneous Graphs

Authors: Eric Sivonxay, Lucas Attia, Evan Walter Clark Spotte-Smith, Benjamin Lengeling, Xiaojing Xia, Daniel Barter, Emory Chan, Samuel Blau
DOI: 10.26434/chemrxiv-2024-1dw4q
December 26, 2024 at 12:41 PM
Very proud of this work, going all the way from implementing the kMC in C++ to building datasets w/ high-throughput workflows to designing the novel graph representation to training the custom hetero-GNN w/ on-the-fly augmentation to inverse design of novel nanoparticles with GNN-based optimization!
December 27, 2024 at 12:03 AM
Reposted by Sam Blau
Long-range machine learning potentials strike again! 🚀 We benchmarked the Latent Ewald Summation method on diverse systems—molecules, solutions, interfaces. Learning just from energy & forces, it delivers the most accurate potential energy surfaces, physical charges, dipoles, and quadrupoles!
Learning charges and long-range interactions from energies and forces
Accurate modeling of long-range forces is critical in atomistic simulations, as they play a central role in determining the properties of materials and chemical systems. However, standard machine lear...
arxiv.org
December 23, 2024 at 5:08 PM
Reposted by Sam Blau
This week, "RNMC: kinetic Monte Carlo implementations for complex reaction networks" was published in @joss-openjournals.bsky.social. Work on RNMC started back in 2021, when brilliant mathematician Daniel Barter suggested using stochastic methods to study #reactionnetworks. #CompChem #ChemSky 🧪 1/6
RNMC: kinetic Monte Carlo implementations for complex reaction networks
Zichi et al., (2024). RNMC: kinetic Monte Carlo implementations for complex reaction networks. Journal of Open Source Software, 9(104), 7244, https://doi.org/10.21105/joss.07244
joss.theoj.org
December 18, 2024 at 1:27 PM
Reposted by Sam Blau
I've been slacking on research updates here! A few weeks ago, a preprint for "HEPOM: Using Graph Neural Networks for the accelerated predictions of Hydrolysis Free Energies in different pH conditions" dropped on @chemrxiv.bsky.social. 1/5
HEPOM: Using Graph Neural Networks for the accelerated predictions of Hydrolysis Free Energies in different pH conditions.
Hydrolysis is a fundamental family of chemical reactions where water facilitates the cleavage of bonds. The process is ubiquitous in biological and chemical systems, owing to water's remarkable versat...
chemrxiv.org
December 18, 2024 at 1:09 PM
Reposted by Sam Blau
Another chance to join our group! We are recruiting a PhD student in digital ligand engineering for nanocatalysis. Reposts and spreading the word to interested people in your network appreciated!
jobs.ethz.ch/job/view/JOP...
PhD position in digital ligand engineering for nanocatalysis
December 10, 2024 at 3:29 PM
Example nanoparticle heterostructure optimization, driven by gradients of UV emission with respect to layer thicknesses and dopant concentrations from our hetero-GNN (not accessible from kMC) and sub-second inference (vs days from kMC) #F24MRS
December 2, 2024 at 3:52 PM
Excited to speak at #F24MRS Thurs 1:30 - 1st talk of my career w/o any DFT connection - we design a hetero-GNN for learning core-shell nanoparticle properties, train on first ever large-scale NP kMC dataset, and use autodiff to optimize -> discover far OOD heterostructures with >6x enhanced emission
December 2, 2024 at 3:51 PM
Reposted by Sam Blau
We are hiring (resharing appreciated)!

Given recent successful grant applications (I got my SNSF Starting Grant 🚀), we are extending the LIAC team with multiple openings (PhD/postdoc) for 2025.

Apply now (deadline: December 20th) by filling in this form: forms.fillout.com/t/eq5ADAw3kkus.
#ChemSky
December 2, 2024 at 10:33 AM