Chris Mungall
cmungall.bsky.social
Berkeley Lab, Environmental Genomics and Systems Biology division. #GeneOntology #MonarchInitiative #AllianceGenome #NationalMicrobiomeDataCollaborative #OBOFoundry.
Pinned
Wow, AlphaGenome is a huge deal: 1 Mb context windows and prediction of a variety of features, with cell and tissue specificity! Read @anshulkundaje.bsky.social's excellent thread for the details. I want to highlight one additional thing for my structured data nerd friends...
This is a really exciting leap forward for genomic sequence to activity gene regulation models. It is a genuine improvement over pretty much all SOTA models spanning a wide range of regulatory, transcriptional and post-transcriptional processes. 1/
Excited to launch our AlphaGenome API goo.gle/3ZPUeFX along with the preprint goo.gle/45AkUyc describing and evaluating our latest DNA sequence model powering the API. Looking forward to seeing how scientists use it! @googledeepmind
Claude Code in a loop plus some markdown files for skills and agents is all you need!
January 16, 2026 at 6:19 AM
Last year we made a CLI wrapper for different deep research APIs. As a baseline implementation we do a simple Claude Code in a loop. It works rather well!
Well, I discovered there is a name for this pattern: Ralph. We made a Ralph Wiggum deep researcher. monarch-initiative.github.io/deep-researc...
January 12, 2026 at 2:41 AM
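A minimal sketch of the loop pattern described in the post above, assuming the agent is wrapped as a plain callable. The names `ralph_loop`, `make_stub_agent`, and the `DONE` marker are hypothetical and are not taken from the deep research CLI; a real setup would invoke the coding agent and rely on files in the working directory to carry state between iterations.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str], str],
    prompt: str,
    done_marker: str = "DONE",
    max_iterations: int = 10,
) -> list[str]:
    """Send the same prompt to the agent repeatedly.

    Stops when the agent's output contains done_marker or when the
    iteration budget is exhausted. Returns every output in order.
    """
    outputs: list[str] = []
    for _ in range(max_iterations):
        output = run_agent(prompt)
        outputs.append(output)
        if done_marker in output:
            break
    return outputs


def make_stub_agent(steps_needed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a real agent call.

    It ignores the prompt and reports completion on call number
    steps_needed, which is enough to exercise the loop.
    """
    state = {"calls": 0}

    def agent(prompt: str) -> str:
        state["calls"] += 1
        if state["calls"] >= steps_needed:
            return f"step {state['calls']}: DONE"
        return f"step {state['calls']}: still working"

    return agent


history = ralph_loop(make_stub_agent(3), "Research the topic described in notes.md")
print(len(history))
```

The iteration cap is a design choice in this sketch: it bounds cost if the agent never emits the completion marker.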
Ontologically, ontogenetically, and phylogenetically, yes
December 20, 2025 at 7:29 PM
Reposted by Chris Mungall
📣 New preprint from us at phagefoundry.org 📣
A solid machine learning framework to predict strain-level phage-host interactions across diverse bacterial genera from genome sequences alone. Avery Noonan from the Arkin Lab led this massive effort.
www.biorxiv.org/content/10.1...
Phage Foundry
phagefoundry.org
November 16, 2025 at 5:58 PM
See the thread (from the original arXiv preprint) over on Mastodon: genomic.social/@Cmungall/11...
Chris Mungall (@Cmungall@genomic.social)
How can we scale up manual classification of chemical structures in databases like ChEBI? Can we help curators place new structures into classes like "terpenoid", based on their che...
genomic.social
October 3, 2025 at 2:58 PM
We developed and evaluated a method to learn Python chemical structure classifiers using LLMs. These can give classifications+explanations at runtime. With @jannahastings.bsky.social @justaddcoffee.bsky.social Noel O'Boyle, Daniel Korn, Adnan Malik jcheminf.biomedcentral.com/articles/10....
Chemical classification program synthesis using generative artificial intelligence - Journal of Cheminformatics
Accurately classifying chemical structures is essential for cheminformatics and bioinformatics, including tasks such as identifying bioactive compounds of interest, screening molecules for toxicity to humans, finding non-organic compounds with desirable material properties, or organizing large chemical libraries for drug discovery or environmental monitoring. However, manual classification is labor-intensive and difficult to scale to large chemical databases. Existing automated approaches either rely on manually constructed classification rules, or are deep learning methods that lack explainability. This work presents an approach that uses generative artificial intelligence to automatically write chemical classifier programs for classes in the Chemical Entities of Biological Interest (ChEBI) database. These programs can be used for efficient deterministic run-time classification of SMILES structures, with natural language explanations. The programs themselves constitute an explainable computable ontological model of chemical class nomenclature, which we call the ChEBI Chemical Class Program Ontology (C3PO). We validated our approach against the ChEBI database, and compared our results against deep learning models and a naive SMARTS pattern based classifier. C3PO outperforms the naive classifier, but does not reach the performance of state of the art deep learning methods. However, C3PO has a number of strengths that complement deep learning methods, including explainability and reduced data dependence. C3PO can be used alongside deep learning classifiers to provide an explanation of the classification, where both methods agree. The programs can be used as part of the ontology development process, and iteratively refined by expert human curators.
jcheminf.biomedcentral.com
October 3, 2025 at 2:57 PM
Reposted by Chris Mungall
Hiding in plain sight - how close are we to mapping ALL 🧬enhancers🧬 in the genome?

Our new paper by Mannion et al. takes a systematic look at "hidden enhancers" and why they remain so hard to find. With @mosterwalder.bsky.social, @jlopezrios.bsky.social & many more

www.nature.com/articles/s41...
August 8, 2025 at 6:10 PM
One super pedantic minor ontological pet peeve is the use of the term "simulation", since that leads me to expect an agent-based or physics-style simulation of cell perturbations. But in fact this pattern could be used for those too! And I guess the terminological horse has long bolted here..
August 25, 2025 at 12:41 AM
But of course rBio is very cool independent of my nerdy obsession with FMs using ontologies/KGs! This general distillation pattern is likely to be very useful for integrating knowledge with the weights in massive omics FMs..
Alt: Dalek meme: Daleks saying "EX-PLAIN" (alluding to use of technique to make foundation models, the "daleks", more explainable)
August 25, 2025 at 12:41 AM
For another use of ontologies in genomic foundation models, see the recent AlphaGenome paper bsky.app/profile/cmun...
August 25, 2025 at 12:41 AM
Aside: I find that too many "defenses" of ontologies/KGs in the face of genAI fall back on a kind of GraphRAG use case, where the ontology/KG is used as some kind of bulwark against hallucination. Valid... but they can do so much more! Using them as a teacher in an RL loop on reasoner traces is v cool!
Alt: Yoda meme: "Teach you I will". Alluding to using the ontology as a "teacher" in the RL loop
August 25, 2025 at 12:41 AM
In order to fine-tune the reasoner model, the authors used three kinds of soft verifiers in the RL loop: experimental (e.g. CRISPRi knockdown), "simulation" (e.g. Transcriptformer), and knowledge-based. For knowledge-based, they used GO @geneontology.bsky.social!
August 25, 2025 at 12:41 AM
The applications of this are very interesting, allowing for interrogation in natural language, as well as background reasoning over the wealth of biology in the literature. So you can ask what happens to other genes if you knock down a gene in a cell type, and get a biological explanation
August 25, 2025 at 12:41 AM
Very exciting to see the research from @cziscience.bsky.social on rBio, distilling a black box foundation model (in this case a "virtual cell" perturbation model) into a smaller reasoner LLM. And it uses ontologies as part of RL! chanzuckerberg.com/blog/rbio-re...
rBio: Reasoning Model Trained on Virtual Cell Simulations
Scientists can ask complex biological questions in plain language and get predictions about gene interactions.
chanzuckerberg.com
August 25, 2025 at 12:41 AM
Reposted by Chris Mungall
The Alliance webinar for August is this Thursday (Aug 21, noon EDT), on Ontologies and the Alliance, presented by Chris Mungall. You can preregister for the Zoom link here: forms.gle/GzMnmwK23SzP...; please preregister by midnight EDT Wednesday Aug 20.
August 18, 2025 at 4:51 PM
I don’t have those details to hand, but this affects the 3 subcontract sites too…
August 15, 2025 at 9:47 PM
This is terrible news, and not just for fly research: Drosophila is a key model organism that helps us understand shared biological pathways and the systems that underpin many human diseases 💔💔💔
FlyBase, a Drosophila database, will lose a third of its team in early October because the Harvard grant that covered the employees’ salaries was canceled. Scientists warn that losing FlyBase could devastate fly research.

By @claudia-lopez.bsky.social

www.thetransmitter.org/community/ha...
Harvard University lays off fly database team
The layoffs jeopardize this resource, which has served more than 4,000 labs for about three decades.
www.thetransmitter.org
August 15, 2025 at 3:37 PM
Reposted by Chris Mungall
FlyBase needs your help! We ask that European labs continue to contribute to Cambridge, UK FlyBase, whereas US and other non-European labs can contribute to US FlyBase. For more information and how to donate: wiki.flybase.org/wiki/FlyBase...
FlyBase:Contribute to FlyBase - FlyBase Wiki
wiki.flybase.org
August 15, 2025 at 12:45 PM
mültifaceted crüe
August 14, 2025 at 4:27 AM
Mild frustration and eye rolling against the machine
August 14, 2025 at 4:23 AM
Reposted by Chris Mungall
@cmungall.bsky.social tackles complex #knowledgeManagement challenges in the life sciences with well-honed collaborative methods and AI-augmented computational tooling, streamlining #ontology creation and #knowledgeGraph building.

knowledgegraphinsights.com/chris-mungall/
Chris Mungall: collaborative knowledge graphs in the life sciences
Chris Mungall is an expert on building knowledge graphs for the life sciences with a wide variety of scientific collaborators.
knowledgegraphinsights.com
August 5, 2025 at 12:57 PM
Reposted by Chris Mungall
Exceptional Contributions to Biocuration - Lifetime Achievement Award winner: Ruth Lovering
Ruth has contributed extensively to the curation of key resources such as HGNC, Gene Ontology (GO), and IMEx, and has been instrumental in developing curation standards. Ruth is a past chair of the ISB EC.
July 30, 2025 at 4:04 PM
Reposted by Chris Mungall
Exceptional Contributions to Biocuration - Advanced Career Award winner: Kimberly Van Auken
Kimberly's career reflects expertise, sustained innovation, & dedicated service to community. She's contributed to many projects, including WormBase, the Gene Ontology, & the Alliance of Genome Resources.
July 30, 2025 at 4:04 PM
Reposted by Chris Mungall
Exceptional Contributions to Biocuration - Early Career Award winner: Tiago Lubiano.
Tiago's a passionate and motivated scientist interested in linked open data, ontologies, the semantic web, and their application in modeling cells and cell types. He is active in many curation projects & with ISB.
July 30, 2025 at 4:04 PM