Egor Marin
marinegor.bsky.social
ML Scientist @ ENPICOM B.V. (Den Bosch, Netherlands)

computational biology, ML, protein design, cheminformatics, fancy dev tooling, tinge of bouldering

https://marinegor.dev
Reposted by Egor Marin
So your data are available upon reasonable request? Well, we are making some reasonable requests - at scale. :)

1. Search literature (currently stubbed)
2. Enumerate papers, extract contacts
3. Send email w/ data drop location
4. Parse data

Does anyone want to help productionize this?
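A rough sketch of what steps 1-3 could look like, using only the Python standard library. Everything here is hypothetical (the `Paper` class, the function names, the example address and the drop URL are made up for illustration), the literature search is stubbed as in the post, and nothing is actually sent. Step 4 is left out because parsing depends entirely on what format the data arrive in.

```python
import re
from dataclasses import dataclass
from email.message import EmailMessage

# Deliberately simple e-mail pattern; real papers will need something more robust.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class Paper:
    """Hypothetical container for one paper's title and full text."""
    title: str
    text: str


def search_literature(query):
    """Step 1 (stubbed): return a hard-coded example instead of a real search."""
    return [
        Paper(
            title="Example study",
            text="Data are available upon reasonable request from jane.doe@example.org.",
        )
    ]


def extract_contacts(paper):
    """Step 2: pull unique e-mail addresses out of the paper text."""
    return sorted(set(EMAIL_RE.findall(paper.text)))


def draft_request(paper, contact, drop_url):
    """Step 3: build (but do not send) a data request with a drop location."""
    msg = EmailMessage()
    msg["To"] = contact
    msg["Subject"] = f"Data request: {paper.title}"
    msg.set_content(
        f"Hello,\n\nCould you share the data for '{paper.title}'?\n"
        f"You can upload it here: {drop_url}\n"
    )
    return msg


if __name__ == "__main__":
    for paper in search_literature("available upon reasonable request"):
        for contact in extract_contacts(paper):
            print(draft_request(paper, contact, "https://example.org/drop"))
```

Keeping each step a plain function makes it easy to swap the stub for a real search later without touching the rest.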
November 21, 2025 at 9:32 PM
Reposted by Egor Marin
The 2025 MDAnalysis User Group Meeting wrapped up. If you want to see what great talks and workshops we had, have a look at the UGM2025 repo github.com/MDAnalysis/U... . It was fantastic to have so many of you in Arizona and joining online! See you all again soon.
November 12, 2025 at 12:09 AM
Just had my "damn I love open-source" moment yesterday: I had been struggling with a new custom reader for @mdanalysis.bsky.social, spent two long evenings on it, then decided to let it go and just ask for help on the org's Discord & GitHub.
November 6, 2025 at 12:29 PM
Oh, and it includes videos of cryoEM grid freezing with laser illumination✨
July 7, 2025 at 10:30 PM
Oh indeed
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.
July 4, 2025 at 11:05 AM
Reposted by Egor Marin
#mdaUGM2025 is officially happening! Abstracts and travel bursary applications are being accepted from now until the July 15, 2025 deadline: www.mdanalysis.org/2025/04/13/u....

🗓️ November 9-11, 2025
📍Tempe, Arizona, USA (and online)

#open-source-software #molecular #simulations #streaming
May 20, 2025 at 6:24 PM
If anyone wants to download them to, say, use them as wallpapers (macOS settings: "Fit to screen", shuffle every day from folder), I ~~vibe-coded~~ wrote a little script that downloads the whole archive!

For uv users, just do:

```
uv run gist.githubusercontent.com/marinegor/86...
```
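For context, `uv run` can execute a single-file script that declares its requirements in an inline metadata header (PEP 723). The sketch below is not the gist's actual content; it is a hypothetical downloader (the function names, argument order and fallback file name are assumptions) that only shows the general shape such a script might take.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
"""Hypothetical single-file downloader; NOT the script from the gist."""
import sys
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def target_path(url, folder):
    """Map a URL to a local file path inside `folder`."""
    name = Path(urlparse(url).path).name or "download.bin"
    return Path(folder) / name


def download_all(urls, folder):
    """Download every URL into `folder`, skipping files that already exist."""
    Path(folder).mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        dest = target_path(url, folder)
        if not dest.exists():
            urllib.request.urlretrieve(url, dest)
        saved.append(dest)
    return saved


# Assumed usage: uv run script.py FOLDER URL [URL ...]
if __name__ == "__main__" and len(sys.argv) > 2:
    folder, *urls = sys.argv[1:]
    for path in download_all(urls, folder):
        print(path)
```

The commented header is what lets uv resolve dependencies on the fly; here the list is empty because only the standard library is used.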
April 4, 2025 at 9:15 AM
Reposted by Egor Marin
MDAnalysis 2.9.0 (and its blog post) is out: www.mdanalysis.org/2025/03/11/r...!

Includes:
* Additional Gromacs/distopia/parallel analysis support
* New "water"/"precision" keywords

🙏 to the release's 10 contributors (3 new), and to @numfocus.bsky.social/@chanzuckerberg.bsky.social for support.
Release 2.9.0 of MDAnalysis · MDAnalysis
www.mdanalysis.org
March 17, 2025 at 7:43 AM
Reposted by Egor Marin
MDAnalysis is participating in Google Summer of Code 2025! Do you like #coding to solve problems in #biophysics #chemistry or #materials?

Read our blog on how to apply: www.mdanalysis.org/2025/02/28/g.... Pre-proposals are due March 21st.

We value transparency; please communicate in public forums.
March 3, 2025 at 8:17 AM
What @hekstralab.bsky.social are doing is what I missed a lot during my BSc, MSc and early PhD -- quality open-source crystallography with a solid software foundation. I'm really excited every time I see a paper from you guys, even though I'm not doing crystallography myself anymore :)
Crystal structures are *not* God-given truth. They approximate, w/ flaws & errors, X-ray diffraction data. AlphaFold etc. have been trained on structures, not data. SFCalculator now differentiably connects structures to diffraction data. What does this enable? 🧵 1/4 www.biorxiv.org/content/10.1...
February 23, 2025 at 3:01 AM
Reposted by Egor Marin
Inventors of flow matching have released a comprehensive guide going over the math & code of flow matching!

Also covers variants like non-Euclidean & discrete flow matching.

A PyTorch library is also released with this guide!

This looks like a very good read! 🔥

arxiv: arxiv.org/abs/2412.06264
December 10, 2024 at 8:35 AM
Love the comparison with the linguistic laws (and surprised it took so long!)

I remember how the paper about Zipf's law for small-molecule substructures actually blew my mind, and this gets pretty close too :)
A nice analysis of different tokenization strategies (BPE, wordpiece, sentencepiece) on protein sequences.

arxiv.org/abs/2411.17669
December 3, 2024 at 2:14 PM
Reposted by Egor Marin
We released MDAnalysis 2.8.0 🚀 See the blog post www.mdanalysis.org/2024/11/22/r... . Highlights: (1) all code under the GNU Lesser General Public License, (2) new Guesser API, (3) general parallelization for analysis tools, (4) DSSP analysis class, (5) more MDAKits.
Release 2.8.0 of MDAnalysis · MDAnalysis
www.mdanalysis.org
November 27, 2024 at 5:33 PM
Ok, the other thing I'm actually really proud of (and that is fairly recent) is the paper with a lengthy title "Regression-Based Active Learning for Accessible Acceleration of Ultra-Large Library Docking": pubs.acs.org/doi/10.1021/...
Regression-Based Active Learning for Accessible Acceleration of Ultra-Large Library Docking
Structure-based drug discovery is a process for both hit finding and optimization that relies on a validated three-dimensional model of a target biomolecule, used to rationalize the structure–function relationship for this particular target. An ultralarge virtual screening approach has emerged recently for rapid discovery of high-affinity hit compounds, but it requires substantial computational resources. This study shows that active learning with simple linear regression models can accelerate virtual screening, retrieving up to 90% of the top-1% of the docking hit list after docking just 10% of the ligands. The results demonstrate that it is unnecessary to use complex models, such as deep learning approaches, to predict the imprecise results of ligand docking with a low sampling depth. Furthermore, we explore active learning meta-parameters and find that constant batch size models with a simple ensembling method provide the best ligand retrieval rate. Finally, our approach is validated on the ultralarge size virtual screening data set, retrieving 70% of the top-0.05% of ligands after screening only 2% of the library. Altogether, this work provides a computationally accessible approach for accelerated virtual screening that can serve as a blueprint for the future design of low-compute agents for exploration of the chemical space via large-scale accelerated docking. With recent breakthroughs in protein structure prediction, this method can significantly increase accessibility for the academic community and aid in the rapid discovery of high-affinity hit compounds for various targets.
pubs.acs.org
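To give a feel for the idea in the abstract, here is a toy sketch of an active-learning loop with a constant batch size and a simple linear model. It is not the paper's implementation: the single made-up descriptor, the synthetic "docking scores", the one-feature least-squares fit and all names are assumptions chosen so the example runs on the standard library alone (no ensembling, no real docking).

```python
import random
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def active_learning_screen(features, dock, batch_size, n_rounds, seed=0):
    """Dock `batch_size` ligands per round, choosing each new batch by predicted score."""
    rng = random.Random(seed)
    pool = list(range(len(features)))
    rng.shuffle(pool)
    batch, pool = pool[:batch_size], pool[batch_size:]  # first batch is random
    docked = {}
    for round_idx in range(n_rounds):
        docked.update((i, dock(i)) for i in batch)  # "expensive" docking step
        if round_idx == n_rounds - 1 or not pool:
            break
        slope, intercept = fit_line(
            [features[i] for i in docked], [docked[i] for i in docked]
        )
        # Lower score = better binder, so take the lowest predictions next.
        pool.sort(key=lambda i: slope * features[i] + intercept)
        batch, pool = pool[:batch_size], pool[batch_size:]
    return docked


# Synthetic library: 2000 "ligands", one descriptor, noisy linear score.
rng = random.Random(1)
features = [rng.uniform(0, 1) for _ in range(2000)]
true_scores = [-5.0 * x + rng.gauss(0, 0.5) for x in features]

docked = active_learning_screen(
    features, lambda i: true_scores[i], batch_size=50, n_rounds=4
)
top_1_percent = set(sorted(range(2000), key=true_scores.__getitem__)[:20])
recall = len(top_1_percent & set(docked)) / len(top_1_percent)

if __name__ == "__main__":
    print(f"docked {len(docked)} of 2000 ligands, recall of top-1%: {recall:.2f}")
```

Only 10% of the toy library gets "docked" here; the printed recall is the fraction of the true top-1% that ended up among those ligands.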
November 26, 2024 at 8:04 PM
Some things that I think are worth sharing here -- there's a secondary structure analysis module in MDAnalysis now!

github.com/MDAnalysis/m...

It's been there for a while now but still isn't in a tagged version afaik, so you have to check it out manually to use it.
Feature/dssp by marinegor · Pull Request #4304 · MDAnalysis/mdanalysis
Fixes #1612 Changes made in this Pull Request: introduces MDAnalysis.analysis.dssp.DSSP class for secondary structure analysis, using code implemented in pydssp package available for secondary str...
github.com
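A minimal usage sketch, assuming an MDAnalysis install that already ships `MDAnalysis.analysis.dssp`. The two helper functions and the `topology_path` argument are hypothetical names for illustration; MDAnalysis is imported lazily inside the function so the rest of the snippet runs without it.

```python
from collections import Counter


def ss_fractions(dssp_string):
    """Fraction of helix (H), strand (E) and loop (-) in a DSSP string."""
    counts = Counter(dssp_string)
    total = len(dssp_string)
    if total == 0:
        return {"H": 0.0, "E": 0.0, "-": 0.0}
    return {code: counts.get(code, 0) / total for code in ("H", "E", "-")}


def dssp_for_first_frame(topology_path):
    """Run DSSP on a structure file and return the first frame's assignment."""
    # Imported here so the helper above works without MDAnalysis installed.
    import MDAnalysis as mda
    from MDAnalysis.analysis.dssp import DSSP

    u = mda.Universe(topology_path)
    run = DSSP(u).run()
    # results.dssp holds one character per residue for every frame.
    return "".join(run.results.dssp[0])


if __name__ == "__main__":
    print(ss_fractions("HHHH--EE"))
```

With a real structure, `ss_fractions(dssp_for_first_frame("protein.pdb"))` would give a quick per-frame summary; the file name is a placeholder.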
November 21, 2024 at 10:10 AM
Ok right off the bat -- any benchmarks about Chai / Boltz-1 / any other AF3-like models in the antibody complex prediction task?
November 20, 2024 at 11:20 PM
First time logging in in a month, and suddenly it seems that someone has spilled a mass-following script somewhere in the GPCR community :)

Anyway, hi everyone, I'm happy to (re)connect with everyone I know and don't know -- I'll post some old-but-gold things about myself soon, stay tuned!
November 20, 2024 at 11:12 PM
Reposted by Egor Marin
Do you know some Python but want to learn about analyzing and visualizing molecular simulation data? Join us Feb 28 at 3:00 UTC for a free workshop on a basic workflow with MDAnalysis and Molecular Nodes!

Spots are limited, so make sure to apply before Feb 19: www.mdanalysis.org/2024/02/05/m...
February 5, 2024 at 4:29 PM
Oh, look, that's me, writing parallelization code!

...and hi bsky, I guess?🤔
Looking for some good Friday reading? Check out the amazing work Xu Hong and Egor completed during the GSoC 2023 program:

www.mdanalysis.org/2024/01/18/g...

Also, please get in touch with us if you are interested in #mentoring with MDAnalysis for the GSoC 2024 program!
January 22, 2024 at 9:41 PM